Sir:

I, Lawrence A. White, hereby declare that: 

I. 

II. Introduction 

1. I am a resident of Duluth, GA and have been working in the field of Information
Technology for 25 years.

2. A copy of my Curriculum Vitae is attached hereto as Exhibit A. As evident from 
my Curriculum Vitae, I am fully familiar with computer-aided design (CAD) programs and
products. Also, in my career I have designed and implemented Emergency and Disaster
Management Centers, both domestically and internationally. A sampling of these centers and
emergency management agencies includes Department of Homeland Security, Urban Area
Security Initiatives (DHS UASI), St. Louis Area Regional Response System (STARRS), New 
Jersey State Police, State of Missouri Emergency Management Agency and Gauteng (South 
Africa) Provincial Disaster Management Centre.

3. I am currently the Chief Technology Officer ("CTO") of Archaio, LLC
("Archaio"), the assignee of the present patent application. I have held this position since 
January 2009. As the CTO of Archaio, I am familiar with the various products that have been 
developed and that are currently under development. I am also familiar with the marketing and 
sales of the various products that have been developed. 

4. I am familiar with U.S. Pat. App. No. 10/629,347, entitled "Systems and Methods 
for Providing True Scale Measurements for Digitized Drawings," hereinafter referred to as the 
Patent Application. I am also familiar with the development of the various products set forth in 
the Patent Application. The design and development process, as well as the later commercial 
success, of the various products described in the Patent Application are briefly described below.
In particular, the considerations for embedding scale information in the header of a raster file are 
discussed. 

III. Materials Considered 

5. I have reviewed and am familiar with the office action dated February 17, 2009 in 
the Patent Application and the prior art references U.S. Patent No. 6,134,338 to Solberg
("Solberg") and U.S. Patent Publication No. 2002/0077787 to Rappaport, et al.
("Rappaport") applied by the Office against the claims of the Patent Application. I have also 
reviewed and am familiar with the set of amended claims submitted to the U.S.P.T.O. on August 
17, 2009. 

IV. Process for Determining Whether Claims are Obvious 

6. Having read the office action and the prior art references, I have formulated the
opinions provided below regarding obviousness. 

V. Ordinary Skill in the Art 

7. I believe the hypothetical person of ordinary skill in the art during the 2003 
timeframe would have been a person having an undergraduate degree in computer science and
postgraduate studies in management information systems who has practical or academic
experience in the field of computer programming. In the 2003 timeframe I was employed with
IBM Corporation as a Global Services Business Consulting Principal responsible for sizing,
scoping, and delivering advanced technology services to the U.S. Federal Marketplace, with a
focus on enterprise system management and emergency response systems for command and 
control for military and intelligence agencies. 

VI. Overview of the Patent Application 

8. It is estimated that approximately 85% of the world's infrastructure plans, such as 
architectural, mechanical, structural, and electrical diagrams, are currently stored in paper format 
only. Digitized raster images of paper infrastructure plans, known as raster files, are
recognized as being good for archiving, printing and sharing. (See
Exhibit B and Exhibit C.) However, despite the benefits of raster files and the fact that raster
files are non-proprietary, the industry largely utilizes raster files only for archiving or
transferring because making considerable modifications to raster files is found to be tedious.
(See Exhibit C.) 

9. At and around the time of the creation of the present Patent Application, the 
industry concentrated on converting or transforming raster files for use with computer-aided 
design (CAD) programs, as CAD has the known ability to precisely describe, create, scale and
manipulate individual objects. (See Exhibit C and Exhibit D "[v]ector files can be scaled, which 
means one can zoom in on the details of a drawing. Also, they are more easily edited than raster 
files.") 

10. Again, despite the benefits of raster files, the industry clearly diverged or taught
away from using raster files for reasons other than archiving, printing, sharing or making minor 
modifications. (See for instance Exhibit C and Exhibit B which finds that "[r]aster images can 
only be edited by adjusting the values of individual dots.") The industry deemed that vectors 
could be made to be mathematically perfect, while rasters could not. (See Exhibit C.) Thus, CAD 
was understood to be the standard in technical drafting notwithstanding the fact that CAD 
programs typically utilize their own proprietary format and therefore are often non-transferable
between different systems and programs. (See Exhibit H.) Another problem with CAD 
programs, such as and including Solberg, is the fact that scale information is stored in external 
library files. If the library file is not available or if the library file becomes corrupted, the scale 
information may be unavailable when a raster file is accessed.

11. The inventor of the Patent Application, however, recognized a clear, unaddressed,
and long felt need for developing non-proprietary systems and methods to electronically store 
infrastructure plans in an accurate, scaled and secure manner, and share infrastructure plans with 
data integrity. 

12. Traditionally, when paper plans are scanned and digitized for electronic storage,
the original physical size of the image of a particular document, and therefore the corresponding
usefulness of the image scale, is no longer a concrete attribute of the image. For example, if a paper
version of an infrastructure plan is thirty inches in height and forty inches in width and then 
scanned, a computer user of that scanned electronic image would see the document as a different 
physical size when using different monitors depending on the size of the display device and its 
own pixel resolution. Thus, the scale that appears on the document (e.g., one eighth inch equals 
one foot, etc.) will be incorrect when an electronic depiction of the document is displayed on a 
computer monitor. This is because the original physical size of a paper image has no direct 
correlation to the pixel dimensions of a computer monitor. As a result, a twenty inch wide monitor
can only display an image as twenty inches wide if viewing the whole image and a twenty-five 
inch wide monitor can only display an image as twenty-five inches wide if viewing the whole 
image. Also, neither monitor would be able to display the whole image as it originally appeared, 
that is, as a forty inch wide image. The user has no way to know what the original physical size 
of the paper drawing was, yet the scale ratio of the image listed on the plan is directly tied to the 
physical size of the original paper document. So if a computer user viewing the scanned
infrastructure plan on a twenty-five inch monitor tried to take a physical measurement of the
image on the computer monitor and used that data with the image scale to manually compute a true
scale measurement, the result would be a wrong measurement value. Furthermore, zooming the
image so that only portions of the original image appear on the computer monitor also distorts
the physical size of the image, making any physical measurement of an image or image element
not useful when combined with scale to calculate a true scale dimension measurement. In
essence, once a paper drawing is scanned, the scale information on that drawing is no longer 
valid and accurate when a digital version of the paper drawing is viewed on a monitor or display 
device. 
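
The distortion described in this paragraph can be illustrated with a short calculation. The following Python sketch is offered for illustration only. The function name and the eight inch wall length are hypothetical and do not appear in the Patent Application, and the sketch assumes that the whole forty inch wide drawing is fitted to the width of the monitor.

```python
def naive_on_screen_feet(paper_len_in, paper_width_in, monitor_width_in, scale_in_per_ft):
    """Feet a user would compute by measuring the screen and applying the printed scale."""
    # The whole drawing is shrunk to fit the monitor width, so every line shrinks with it.
    shown_len_in = paper_len_in * monitor_width_in / paper_width_in
    # The printed scale (inches of paper per foot) is then applied to the on-screen length.
    return shown_len_in / scale_in_per_ft


PAPER_WIDTH_IN = 40.0      # width of the original paper plan
SCALE_IN_PER_FT = 1 / 8    # printed scale: one eighth inch equals one foot
WALL_ON_PAPER_IN = 8.0     # hypothetical wall length as drawn on paper

true_feet = WALL_ON_PAPER_IN / SCALE_IN_PER_FT
on_25_inch = naive_on_screen_feet(WALL_ON_PAPER_IN, PAPER_WIDTH_IN, 25.0, SCALE_IN_PER_FT)
on_20_inch = naive_on_screen_feet(WALL_ON_PAPER_IN, PAPER_WIDTH_IN, 20.0, SCALE_IN_PER_FT)
print(true_feet, on_25_inch, on_20_inch)
```

Under these assumptions the same wall yields a different computed length on each monitor, and neither value matches the length obtained from the paper drawing.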

13. In response to the long felt need for systems and methods to electronically store
infrastructure plans while maintaining true and accurate scale information and the need to 
permanently secure scale to digitized plans, the inventor of the Patent Application embeds scale 
information in a header of a digitized raster image of an infrastructure plan. The inventor of the 
Patent Application employs a specific private Tagged Image File Format (TIFF) header tag
from Adobe Systems to store scale information. 

14. This private header tag is a dedicated location that permits scale information to be 
secured indefinitely within a header of a raster image. As a result of embedding the scale 
information within the header of a raster file, the digital raster image and all the data needed to 
calculate dimension measurement data at a future time can be stored as a single file. Also, as 
there are hundreds and thousands of tags, the embedding of scale information in the dedicated
location permits quick and easy access to such scale information rather than having to 
unnecessarily delve through the thousands of tags in the TIFF header. 
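
As an illustration of storing scale in a dedicated header location of a single raster file, the following Python sketch writes a minimal uncompressed grayscale TIFF whose image file directory carries one private tag holding a scale string, and then reads that tag back. It is a sketch under stated assumptions, not the implementation of the Patent Application: the tag number 65000, the text format of the scale value, and the function names are hypothetical, and a production file would normally carry further fields such as resolution.

```python
import struct

SCALE_TAG = 65000              # hypothetical private tag number (private range is 32768 and above)
SHORT, LONG, ASCII = 3, 4, 2   # TIFF field type codes


def build_tiff(width, height, pixels, scale_text):
    """Return a minimal little-endian 8-bit grayscale TIFF with a private scale tag."""
    if len(pixels) != width * height:
        raise ValueError("pixels must hold width * height bytes")
    scale_bytes = scale_text.encode("ascii") + b"\x00"  # ASCII fields end with NUL
    if len(scale_bytes) <= 4:
        raise ValueError("this sketch only handles scale strings stored by offset")
    padding = b"\x00" * (len(scale_bytes) % 2)  # keep the pixel data word aligned
    n_entries = 9
    ifd_offset = 8  # the directory follows the 8-byte file header
    scale_offset = ifd_offset + 2 + n_entries * 12 + 4
    pixel_offset = scale_offset + len(scale_bytes) + len(padding)
    entries = [  # directory entries must be sorted by tag number
        (256, SHORT, 1, width),            # ImageWidth
        (257, SHORT, 1, height),           # ImageLength
        (258, SHORT, 1, 8),                # BitsPerSample
        (259, SHORT, 1, 1),                # Compression: none
        (262, SHORT, 1, 1),                # PhotometricInterpretation: black is zero
        (273, LONG, 1, pixel_offset),      # StripOffsets
        (278, SHORT, 1, height),           # RowsPerStrip
        (279, LONG, 1, len(pixels)),       # StripByteCounts
        (SCALE_TAG, ASCII, len(scale_bytes), scale_offset),
    ]
    out = struct.pack("<2sHI", b"II", 42, ifd_offset)
    out += struct.pack("<H", n_entries)
    for tag, field_type, count, value in entries:
        # In little-endian order a SHORT packed into the 4-byte field is left-justified.
        out += struct.pack("<HHII", tag, field_type, count, value)
    out += struct.pack("<I", 0)  # no further image file directory
    return out + scale_bytes + padding + bytes(pixels)


def read_scale(data):
    """Return the scale string stored under SCALE_TAG, or None if the tag is absent."""
    byte_order, magic, ifd_offset = struct.unpack_from("<2sHI", data, 0)
    if byte_order != b"II" or magic != 42:
        raise ValueError("not a little-endian TIFF")
    (n_entries,) = struct.unpack_from("<H", data, ifd_offset)
    for i in range(n_entries):
        tag, field_type, count, value = struct.unpack_from("<HHII", data, ifd_offset + 2 + i * 12)
        if tag == SCALE_TAG and field_type == ASCII:
            return data[value:value + count].rstrip(b"\x00").decode("ascii")
    return None


tiff = build_tiff(2, 2, bytes([0, 64, 128, 255]), "1/8 in = 1 ft")
print(read_scale(tiff))
```

The reader locates the scale by its tag number alone, and the scale and the pixel data are held in one file.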

15. The act of embedding scale information within a header of the digital raster 
image, using the dedicated private header tags and storing the digital raster image as a single file 
are important aspects of the Patent Application. 

16. The Patent Application, by way of these important aspects, provides a raster file 
that secures the scale information to the single digital raster image, as it is known that the integrity of
a single raster file can be better maintained than that of multiple raster files, which easily
disassociate from one another. Also, the Patent Application prevents scale information that is
embedded in the header of a raster image from being lost or overwritten, as it is known that the
information stored in a raster header location is lost or overwritten much less often than the data kept
in the main body of a file. Further, raster images are by their nature not able to store data other 
than pixel data in the main body of the electronic file. 

17. The file of the Patent Application can be easily transferred between different 
systems and software programs and stored to be readily available for subsequent access.

18. The file of the Patent Application can also be quickly opened and the embedded
scale information quickly accessed in order to determine true scale measurement information 
upon subsequent access of the digitized raster file. 

19. For example, an architectural drawing may be converted to a digitized raster file 
and scale information can be embedded in the header of the digitized raster file. Once the 
digitized raster file is rendered, a user may draw a line or shape in the rendered architectural 
drawing, and true scale measurements for the drawn line or shape (e.g., distance or area) may be 
determined utilizing the embedded scale information. 
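
The measurement described in this example can be sketched in Python as follows. The sketch is illustrative only and rests on assumptions not stated in the Patent Application: the scan resolution in dots per inch is taken to be known, the scale is expressed as inches of paper per foot, and all function names and coordinates are hypothetical.

```python
import math


def line_length_feet(p1, p2, dpi, scale_in_per_ft):
    """True scale length, in feet, of a line drawn between two pixel coordinates."""
    pixels = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    paper_inches = pixels / dpi              # length on the original paper drawing
    return paper_inches / scale_in_per_ft    # apply the embedded scale


def polygon_area_sq_feet(points, dpi, scale_in_per_ft):
    """True scale area, in square feet, of a polygon drawn as a list of pixel coordinates."""
    twice_area = 0.0
    # Shoelace formula over consecutive vertex pairs, closing back to the first vertex.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    paper_sq_inches = abs(twice_area) / 2 / dpi ** 2
    return paper_sq_inches / scale_in_per_ft ** 2


DPI = 200                 # assumed scan resolution
SCALE_IN_PER_FT = 1 / 8   # embedded scale: one eighth inch equals one foot

length_ft = line_length_feet((0, 0), (300, 400), DPI, SCALE_IN_PER_FT)
area_sq_ft = polygon_area_sq_feet([(0, 0), (200, 0), (200, 400), (0, 400)], DPI, SCALE_IN_PER_FT)
print(length_ft, area_sq_ft)
```

Because the calculation works from pixel coordinates and the stored scale, the result does not depend on the size of the monitor or on the zoom level at which the line or shape was drawn.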

20. By creating the single digital raster image file with scale embedded in the 
dedicated location of the header of the single file, the inventor of the Patent Application provides 
a non-proprietary electronic file format that is readily available for use by a wide variety of 
different individuals. Users of the Patent Application are not required to be familiar with 
sophisticated software programs and products, such as computer-aided design (CAD) programs 
and products. Thus, unskilled users may be able to quickly and efficiently utilize the claimed 
invention with ease. 

21. A wide variety of dimension data can be calculated using only the raster image
with the scale data embedded in the raster image header. By using the present Patent
Application, measurements can be calculated from input that is not previously prescribed. For 
instance, a line can be drawn on a raster image from the middle of a wall in a room to any other 
point in the room and a true scale measurement of the drawn line can be obtained using the scale 
embedded in the dedicated location of the digital raster image. These non-prescribed 
measurement values are not listed on the original image as it would be impossible to list all 
possible element measurements and combination of element measurements on a paper drawing 
image. 

22. Also, a wide variety of different applications may utilize the systems and methods 
set forth in the Patent Application. Some example applications include Emergency Management
and Response applications. (See Co-pending Application No. 11/068,268.) Raster images of
building schematics may be utilized to calculate true scale measurements for the movement of
emergency personnel within the building. Given the location of emergency personnel (e.g.,
firefighters, etc.) within the building, a route for the personnel and true scale measurements
associated with the route may be determined. The emergency personnel may then be provided 
with accurate directions while inside the building. 

VII. The Solberg Reference Alone Or Combined With The Rappaport Reference 
Does Not Render Obvious Claims 1-2, 4-20.

A. The Solberg Patent Fails To Teach Claimed Elements 

23. Solberg discloses a CAD program that utilizes raster files as intermediate files in 
the creation of a complex CAD file. (See Solberg Col. 22, lines 63-65). Specifically, Solberg 
scans a paper document and stores pieces of the original paper document as multiple raster files 
that are then transformed into a CAD file, a second, associated, but clearly distinguishable 
electronic file entity. (See Fig. 2, Steps 3.3 and 3.4, Col. 25, lines 3-7 and Fig. 4 Col. 26, lines 
15-25.) Solberg's invention is designed to automate the transfer of 2D raster image data 
concerning real world objects into mathematically accurate 3D vector models. (See Solberg 
Abstract and Col. 40 lines 6-14.)

24. I agree with the statement in the office action finding that Solberg does not teach 
embedding scale in the header of a digital raster image. As mentioned above, CAD programs, 
such as and including Solberg, preserve scale information in an external library file, which if 
corrupted or separated from the raster file will result in the loss of the scale information. Scale
information may also be read from a scanned document utilizing optical character recognition
techniques or keyed in upon prompting; however, again the scale information is not embedded in
a raster image. (See Solberg col. 14, lines 25-54, col. 19 line 45 to col. 20, line 8 and col. 16
line 36 to col. 17 line 16.) Furthermore, in view of the set of amended claims submitted on
August 17, 2009, I believe that Solberg does not teach embedding scale in a dedicated location of
a header of a digital raster image. 

25. Solberg also does not teach storing the digital raster image as a single file with 
scale embedded in the header of the single digital raster image file as presented in the set of 
amended claims submitted on August 17, 2009. The Patent Application preserves the original 
raster bitmap format as a single document and simultaneously preserves scale. In contrast,
Solberg teaches the creation of a plurality of viewpoint raster files and discloses creating a
separate raster file, a floating viewpoint 242, which contains scale 187. (Solberg col. 18 line 61 to
col. 20 line 14.) This scale information is a separate and distinct raster file from the raster file(s)
used to store the plurality of viewpoints 122. (See col. 14, lines 25-54 and col. 19 line 45 to col.
20, line 8.)

26. Solberg fails to teach associating scale with raster files; rather, Solberg teaches
associating scale with a CAD file. After a CAD viewport is selected in steps 3.1-3.2, the
AUTOCAD program prompts the user for the drawing scale in step 3.3 and then later the raster
file is imported into the AUTOCAD program. (See Solberg Col. 25, lines 9-14.) Clearly the scale
has not been set for the raster file as it had yet to be imported. Scale may also be read from the
alphanumeric text representing scale information on the face of the raster image once in the CAD
program. (See Solberg at Col. 16 line 36 to col. 17 line 16.) Thus, Solberg is not associating scale
with a raster image and not embedding scale in a dedicated location of a header of a single raster 
image. 

27. Furthermore, Solberg does not teach calculating a true scale measurement of a 
drawn line or shape based at least in part on the embedded scale information in said dedicated 
location of said header of said single file. Solberg instead teaches converting hard copy 
drawings into mathematically accurate vectors corresponding to physical dimension and edges of 
3D objects and moiety that symbols represent. (Abstract, col. 13 lines 40-45.) The result of
Solberg is to use the vector file as a 3D computer model of the 3D object and the moiety 
represented by a symbol. (Col. 40, lines 6-14.) Also, the mathematically accurate AUTOCAD 
drawing file may be printed to create a new hard copy of the newly created converted 
engineering drawing. (Solberg col. 57 lines 1-15.) 

28. Solberg's creation of mathematically accurate vectors traced over a previously
drawn line does not teach the calculation of a true scale measurement of drawing input. (See
Solberg Step 6.) Solberg simply correlates dimension information to a shape. This dimension
data that is associated with a line is accessed from the CAD library file for display. Thus,
Solberg is neither accessing scale data from a header of a raster image nor utilizing scale data
stored in the header to calculate a true scale measurement of a drawn line or shape. It is
understood that the measurements displayed by Solberg are limited to the prescribed dimensions
shown on the face of the image.

6483195.3

ARCl 1.012

B. The Rappaport Reference Fails To Teach Claimed Elements 

29. Rappaport looks to provide an admittedly non-scaled contextual map for the 
association of external device or tool collected metric data wherein the metric data may be 
visually interpreted and associated against the image back drop of the spatial environment it 
describes. (Rappaport abstract, [0025] and [0075].) Rappaport is designed as a means to store 
measured network performance where a measurement reading is associated with some textual or 
graphical identifier to enable easy inspection or analysis of data by anyone especially by a less- 
technical or untrained individual. (Rappaport [0069].) 

30. The office action asserts that Rappaport teaches embedding information in a 
header of a digital raster image. Although Rappaport teaches the known concept of storing
file-identifying information in the header of a file, such information is metadata or generic
information. Metadata is descriptive data about data retained within the body of the computer 
file. (See Exhibits E, F and G.) It is a key component of data lineage as it provides basic 
information about the source and derivation of a data set. (See Exhibit F.) Metadata includes 
information that describes file content such as a file name or a file type; quality; condition and 
other appropriate characteristics of the data. Metadata provides the necessary information for an 
application to "recognize" and "understand" the file, see Exhibit E, and also, this type of 
information is used "to properly transmit" a file. (Rappaport [0095] and [0010].) 

31. In contrast to the known concept of inserting file-identifying information (file 
name, date, comments, etc.) in the header, the Patent Application embeds specific image 
information, scale information, in a dedicated location of a header of a digital raster image that 
unlocks additional functional possibilities. 

32. Also, Rappaport does not teach embedding scale in a digital raster file. Rappaport 
teaches putting information in the header of a generic computer file. The differences between a
header of a generic computer file and a header of a digital raster image are significant and well
known. A header of a generic computer file has a fixed role with limited capacity. A generic file
header has 5-10 slots/fields and is used to hold limited information such as metadata. The file
header of a generic computer file is not accessed for reasons other than obtaining transmission 
information. (See Rappaport [0010], [0095], [0116] and [0124].) Also, a generic electronic file 
header can only be read and properly processed if it is in a proprietary or standard file exchange 
format. Rappaport does not teach a generic electronic file header that is in a proprietary or 
standard file exchange format. 

33. In contrast, the Patent Application employs the header of a raster file. The digital 
raster header is known to have an exponentially larger capacity than the limited capacity of a 
generic file header to hold information. A header of the digital raster image can have an 
unlimited number of private TIFF tags that can be expanded all the way up to the TIFF file size
limit of 4 GB at 8 bytes per tag. Also, computer systems are designed to intelligently process the 
information retained in the header of the TIFF file, a non-proprietary interchange format. This 
format can be universally read and processed without limitation. The act of embedding scale 
information in a file header securely stores the scale information in the raster file. Thus, 
Rappaport's teaching of placing metadata in a header of a generic computer file does not equate
to or render obvious the embedding of scale in a header of a digital raster image. 

34. As for the assertion in the office action that scale could be put in the header via 
the notes subsection, this is not viable. It is known that the notes subsection of the header has 
limited capacity and is used for general purposes such as to retain metadata and other limited 
information. (See Rappaport [0116] and [0124].) Rappaport clearly teaches using the "notes" to 
retain metadata such as comments. See for example the "notes" line in Fig. 3 of Rappaport which 
states "Notes: Location file for Blacksburg Office." This comment placed in the notes subsection
of Fig. 3 is nothing more than metadata. Furthermore, a computer system cannot intelligently
process information from the notes subsection of a computer file; rather the computer only is
able to consume information from the notes subsection as general comments or notes.

35. Rappaport's teaching of inserting metadata or file-identifying information in the
notes subsection of a header does not render obvious the embedding of scale information in a 
dedicated location. A dedicated location in the header of a digital raster image has an address and 
a location that retains specified information. A computer can intelligently process information
stored in a dedicated location. The insertion of comments in the notes subsection is not equal to
embedding scale in a dedicated location. Thus, it would not be obvious to insert scale 
information in the notes subsection of a header as information in the notes subsection is ignored 
and not processed by a computer. 

36. For the reasons stated above the combination of Solberg with Rappaport would 
not be feasible and thus would not render obvious the invention of the Patent Application as 
claimed. 

C. The Claimed Combination Of Rappaport With Solberg Changes The Principle Of
Operation Of Solberg, The Primary Reference, And Renders Solberg Inoperable
For Its Intended Purpose 

37. The office action finds it would have been obvious to combine Solberg's teaching 
of manually entering scale with Rappaport's teaching of placing information in the notes
subsection. Specifically, the office action asserts scale could be placed in the notes subsection of
a header of a digital raster image. However, such a combination would change Solberg's
principle of operation and render Solberg inoperable for producing mathematically accurate 3D
vectors. (Solberg Abstract.)

38. Solberg sets scale for the CAD system and this act is accomplished in Step 3.3 
("Set drawing scale"). (Solberg Col. 25 lines 1-14.) The office action's suggested combination 
would in effect be replacing Solberg's scale setting step (Step 3.3), as scale would now be set 
using Rappaport. This combination would result in Solberg's CAD system not being able to read 
the scale as it is known that CAD systems consume information from the body of the file and not 
from the header of a file. Thus, the placement of scale in the header of a file would change the
principle of operation of Solberg as its Step 3.3 would be replaced. As a result, Solberg would be
rendered inoperable for its intended purpose of attaining mathematically accurate 3D vectors. 

39. The replacement of Solberg's scale setting step with Rappaport's teaching would 
further change Solberg's principle of operation as Solberg's CAD system would not be able to read
scale from the header of a file and thus would have to resort to a default scale. It is known that
CAD programs use a default scale of a 1:1 ratio, stored in their own proprietary data store
format, to perform mathematical calculations and measurements. The presence of a default scale 
in CAD programs was demonstrated in the Personal Examiner interview held on July 20, 2009. 
In the CAD demonstration, a true scale measurement of a drawn line was not attained as the 
CAD program used a default scale. Again, Solberg would not achieve its intended purpose of 
attaining mathematically accurate vectors. 

40. The office action's suggested combination of placing scale in the header of the 
digital raster image would further change Solberg's principle of operation as Solberg would no
longer be setting scale for the CAD application. Solberg's Step 3.3 of setting scale in the CAD
system is an important and critical step as it is required in order to properly introduce the 
viewport raster file 124 into the CAD viewport 290. Step 3.3 is also important because the 
viewpoint raster file image 350 in the viewport 290 serves as a backdrop to the production of the 
vectors 268. (Solberg Col. 25 line 25 to Col. 26 line 14.) If scale is not set using Step 3.3, then
here again mathematically accurate vectors could not be produced and ultimately Solberg is 
rendered inoperable. 

41. It should be noted that Solberg's CAD program would not be able to consume 
information from the header of a file without substantially changing Solberg's principle of
operation. In order for Solberg's CAD system to access information from a file header, the CAD
program would require the computer file be in a format that is native to AUTOCAD, such as a 
binary proprietary format. There are two acceptable formats: a proprietary format and a 
supported file exchange format. The proprietary format must be written and read specifically by 
the CAD application. Solberg does not teach or feasibly suggest modifying CAD applications to 
support this process. The supported file exchange format is in a .dwg or .dxf format. Solberg 
does not teach .dwg format and only teaches .dxf as the file name for storing the accurate CAD 
drawing file. Only the proprietor of the CAD program can change how the CAD system 
consumes information. Solberg does not teach changing the CAD program to consume 
information from the header of a file, nor in reality is there any ability/force to change 
commercially available, proprietary CAD systems.

VIII. Examiner Fails to provide an Apparent Reason with Rational Underpinnings to
Combine Solberg with Rappaport 

42. The office action asserts the motivation to combine Solberg with Rappaport "is to 
allow users to instantaneously interpret the measurement value (scale information) and allows 
one to understand or recall with ease the measurement type, measurement location, and etc." 
(See Rappaport [0074] and [0070].) 

43. The office action appears to inaccurately equate Rappaport's term "measurement"
to scale. Rappaport uses "measurement" to describe the performance of a network of distributed
components (performance metrics) or characteristics of any spatially distributed
group of objects. For instance, Rappaport describes measurements with regard to networks such
as a communication network or a distributed infrastructure network for carrying power, heat,
air-conditioning, fluids, and the like or to the physical observation about quality or quantity of
objects such as furnishings of a room, quality of paint or inventory of equipment. (Rappaport 
[0076], [0096] and [0106].) Furthermore, Rappaport does not teach embedding scale. Nowhere 
in the description of Rappaport's invention is the term "scale" used. Rappaport's single and only 
use of the term "scale" is made in order to distinguish itself from a prior art reference.¹

44. Rappaport describes a visual graphical environment where real world objects and
environments are represented as approximations with no concern for scale or the implications of
securing this data. (Rappaport [0070], [0072], [0078] and [0092].)

45. In contrast, the scale described in the Patent Application is a ratio that in part
helps calculate spatial and physical measurements that are taken from the paper drawing. For
example, scale is known as inches to inches in a document scale. (Patent Application [0032]-
[0044].) I understand the difference between "scale", as claimed in the present Patent
Application, and the dimension measurements that the scale helps calculate in the present
invention and "measurement" as employed in Rappaport and find the two terms to be different
and distinguishable.



¹ In an electronic word search conducted on Rappaport, the term "scale" was only used once to
distinguish a prior art reference from the Rappaport invention. (See Rappaport [0064].)

46. Also, the language of Rappaport cited as motivation to combine Solberg and
Rappaport, when read, does not provide an apparent reason with rational underpinning to
combine the two references.

47. Users of Rappaport's invention are able to instantly interpret the performance of
a network by viewing cues or "graphical display[s] of measurement data" overlaid on the 
computer representation of the environmental-raster image. (Rappaport [0057] and [0124].) 
These icons or text strings overlaid on the raster image give context to the value or location of 
the measurement displayed on the raster image. 

48. In contrast to providing cues to give context, the Patent Application calculates
a true scale measurement using scale information that is embedded in the header of the raster 
image. 

49. As stated above, the combination would not result in a single digital raster image 
with scale embedded in a dedicated location of the header of the raster file and true scale 
measurements could not be calculated therefrom as claimed in claims 1, 6, 10 and 15 of the
present Patent Application. 

50. Also, there is no reason to combine the two references since the combination of 
the two references would not result in success. As CAD systems do not consume information 
from the header of any computer file, there would not have been any reason to have considered
selecting scale information and storing such information in a header of a single digital raster 
image file. CAD programs store scale information in external library files that can be associated 
with a CAD file. Furthermore, a large amount of information is typically stored in the CAD 
library files, thus it would not be feasible to store all of the library file information in the header 
of a raster file. 

51. In addition, if Rappaport were combined with Solberg, the combination of 
references would likely result in Rappaport's textual strings and/or graphical icons associated 
with performance metrics being overlaid on Solberg's intermediate raster file. Solberg would 
then, if possible, have to recognize and convert these texts and icons as accomplished in 
Solberg's later steps 4-7. Such a combination again would not result in the invention claimed in 
claims 1, 6, 10 and 15 of the Patent Application where a single digital raster image has scale 
embedded in a dedicated location of the header of the raster file and where true scale 
measurements are calculated using the embedded scale information. 

IX. Secondary Considerations 

52. Since development of the systems and methods set forth in the present Patent 
Application, Archaio has achieved substantial commercial success. The inventions have led to 
multiple sole source contracts in the United States at county, state, and federal levels. In each 
case, the contracting vehicles point to a unique technology. Additionally, because no bidding 
process was utilized in any of the cases, Archaio had to demonstrate that no competitive products 
or offerings were available. Accordingly, I believe that the Archaio inventions are unique, novel 
and non-obvious. 

53. Moreover, as a result of the solution provided by the present Patent Application, a 
partnership has been established between Archaio and IBM to market products throughout the 
world. Due to the commercial success and potential market opportunities for various 
embodiments set forth in the present Patent Application, it is believed that a unique solution has 
been developed to satisfy the long felt need for systems and methods that efficiently store 
accurate scale infrastructure plans in an electronic format. 

54. During the development of the systems and methods of the present Patent 
Application, the industry was aware of and familiar with CAD programs and products. 
However, CAD programs were not incorporated into the invention of the present Patent 
Application as CAD programs typically utilize their own proprietary formats, often making CAD files 
not transferable between different systems and programs. (See Exhibits H and I.) Another 
concern with using proprietary formats is the possibility of the owner of that particular format 
going out of business thus leaving the format unsupported in the future. (See Exhibit I.) The 
Patent Application overcomes these concerns as it utilizes non-proprietary files with scale 
information embedded in a header. These non-proprietary raster files are easily transferable 
between different systems and software programs and readily available for use by a wide variety 
of different individuals. The files can be opened and the embedded scale information may be 
quickly accessed following scanning in order to determine true scale measurements. Also, CAD 
programs were not incorporated into the invention of the present Patent Application as CAD 
programs store scale information in external library files thereby increasing the likelihood of 
information loss or disassociation. Accordingly, CAD programs were not considered when 
developing the invention of the Present Application. 

55. The Patent Application provides a single phase validation where, by embedding 
scale of the initial document in the header, measurements can be calculated by directly using the 
embedded scale information stored in the dedicated location of the header of a file. Thus, 
measurement results can be validated simply by confirming the accuracy of the scale embedded 
in the header. In contrast, the prior art inventions require a minimum 3-phase validation. For 
instance, in the Solberg invention validation can only be performed by a multi-phase validation 
of the Raster to Vector CAD conversion process. Solberg's multi-phase validation includes 
storing scale in the CAD file, manually inputting the scale information within the CAD system 
and finally reviewing the source document. Furthermore, validation of the Solberg method is 
impossible to perform if the file has been shared and the party does not have access to the source 
document. 

56. The Patent Application provides a secure and efficient means of ensuring 
accuracy because measurements are calculated by directly using the scale embedded in the file, 
specifically in the dedicated location of a header of the file. Thus the Present Application ensures 
the calculation of accurate measurements even after file sharing and distribution. 
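
The calculation described in paragraphs 55 and 56 can be illustrated with a minimal Python sketch. The `RasterHeader` type, its field names, and the choice of storing the scan resolution next to a scale ratio are assumptions made only for illustration; they are not taken from the Patent Application or from any particular raster file format.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RasterHeader:
    """Hypothetical header fields; the names are illustrative only."""
    dpi: float          # scan resolution in dots per inch
    scale_ratio: float  # real-world inches represented by one inch on paper


def true_scale_distance(header: RasterHeader, p1, p2) -> float:
    """Return the real-world distance, in inches, between two pixel coordinates."""
    pixel_distance = math.dist(p1, p2)          # distance in dots
    paper_inches = pixel_distance / header.dpi  # distance on the scanned sheet
    return paper_inches * header.scale_ratio    # distance at true scale


# Example: a drawing at 1/8" = 1'-0" (ratio 96) scanned at 200 dpi.
header = RasterHeader(dpi=200.0, scale_ratio=96.0)
length_in = true_scale_distance(header, (0, 0), (300, 400))
print(length_in / 12)  # length in feet
```

In this example the two points are 500 dots apart, which is 2.5 inches on paper at 200 dpi and 240 inches (20 feet) at true scale. Because both values travel with the file, the result can be checked by confirming the header values alone.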

57. The Patent Application provides an immediately intelligible raster file from which 
true scale measurement can be calculated. 

58. The U.S. Army has allocated $34 Million to CACI International, Inc. as a prime 
federal systems integrator to provide three-dimensional renderings of blueprints and architectural 
drawings that enhance the Army's ability to ensure comprehensive facilities support and 
protection. CACI has not allocated a vendor to the project and is in partnership discussions 
with Archaio, as the present Patent Application and related applications are the only existing 
solution to quickly, efficiently and accurately produce renderings of blueprints and architectural 
drawings, viewable precisely to scale. 

59. The present Patent Application provides a non-proprietary solution that provides 
several unique benefits over the proprietary CAD systems such as those non-limiting examples 
of benefits cited above. The resultant non-proprietary file can be distributed and utilized by 
collaborating agencies, departments, and companies without the need to purchase a CAD system 
or invest in training and maintaining CAD users and environment. 

60. The ability to share and corroborate accurate information quickly is vital to our 
National Security and Emergency and Disaster Management Centers. There is an urgent and 
immediate need to quickly and efficiently prepare and respond based on a Common Operating 
Picture (COP). The unique capabilities provided by the Patent Application deliver a 
comprehensive internal situation awareness of the COP which can be used to save lives and 
protect property. 

X. Attestation 

61. I hereby declare that all statements made herein of my own knowledge are true 
and that all statements made on information and belief are believed to be true; and further that 
these statements were made with knowledge that willful false statements and the like so made 
are punishable by fine or imprisonment or both under Section 1001 of Title 18 of the United 
States Code and that willful false statements may jeopardize the validity of the application or any 
patent issued thereon. 

Date: August 17, 2009 




Lawrence A. White 



Lawrence A White 



(404)424-3084 



2750 Towne Village Dr, Duluth, GA 30097 



lwhite@archaio.com 



Professional Profile 

Over 25 years of experience applying Information Technology to solve core business 
problems and processes. Highly skilled Solution Architect with significant experience with 
CAD development, CAD integration, Emergency Management, and Services Oriented 
Architectures (SOA). 

Consulting Experience: Processing Re-Engineering, Organizational Change Management 
Technical Skills: System Architecture, System Integration, High Availability, Disaster Management 
Software Engineering: C/C++, Object-Oriented Programming, Pascal, Fortran, Assembler 
CAD Software: CEAL, AutoCAD, MicroStation, Intergraph, RDS, CADAM, CATIA 



Professional Accomplishments 

Gauteng Provincial Disaster Management and Emergency Operations Centre 

Lead consultant on the design, architecture, implementation and operational readiness of 
the Gauteng Provincial Disaster Management and Emergency Operations Centre in 
Midrand, South Africa, serving Pretoria, Johannesburg and the greater Gauteng 
Province. The Centre's focus is on disaster risk reduction and effective emergency 
preparedness of emergency services from all sectors, including pre-planning of 
infrastructural developments, assessing capacity and identifying gaps of line functions to 
respond to major incidents. The Centre will provide real-time monitoring of critical facilities 
and will also be used for the 2010 World Cup. The Centre provides effective response to 
major incidents and disasters that entail the development of effective response plans, 
ensuring the implementation of line function response plans, testing and training through 
desktop exercises of response plans, as well as the recruitment and training of 
emergency service volunteers. 

St Louis Area Regional Response System 

Lead consultant on the design, architecture, implementation, operational readiness and 
deployment of the St Louis Area Regional Response System (STARRS). The STARRS 
system provides emergency response, collaboration and preparedness across eight 
counties and two states within the DHS UASI region centered on St Louis. STARRS is designed to 
support preparedness and response to critical incidents such as natural disasters, 
pandemics, accidents and intentional acts like chemical, biological, radiological, nuclear or 
explosive (CBRNE) events. 



New Jersey State Police 



Lead consultant on the design, architecture, and implementation of the State of New 
Jersey Emergency Preparedness Information Network (EPINet) OneStop. The New Jersey 
EPINet OneStop is a secure and personalized gateway providing simplified, integrated 
access to people, information, applications, and business processes for those involved in 
law enforcement, homeland security, emergency preparedness and response. The New 
Jersey EPINet OneStop is part of the EPINet program. EPINet is New Jersey's program 
for implementing Information Technology solutions that span the Homeland Security, Law 
Enforcement, Emergency Management and First Response communities of interest. The 
features include access to EPINet subsystems including: 

• E-Team - the New Jersey Statewide incident management system that provides 
situational awareness 

• RDDB - a database used to identify and catalog all the resources available within 
the State of New Jersey 

• EMV - a powerful tool for querying and visualizing spatial information 

• Searchable information clearinghouse for data, contracts, services, applications, 
and resources 

• Multiple profiles 

• Live chat/meeting capabilities 

• Support for multiple clients (desktop, web, mobile) 



IBM Corporation 

Engineering/Scientific Specialist supporting Desert Research Institute, Department of 
Defense Engineering Research and all major Engineering Research companies in the 
Western United States. Extensive CAD deployment, integration and innovation. 
Experience with CADAM, CATIA, MicroStation, AutoCAD, and CEAL. 

US Federal 

IBM Executive responsibility expanded the IBM Federal Systems Division portfolio to 
include unique and innovative IBM technology services. 



Puerto Rico Highway Authority 

Lead consultant on the design, architecture, integration and implementation of a 
comprehensive engineering system for the Puerto Rico Highway Authority (PRHA). The 
project included all engineering departments of the PRHA: Survey, Photogrammetry, Road 
& Highway Design, and Bridge Design. The core CAD components of the project were 
comprised of CEAL, Intergraph, CADAM, CATIA, and GTSUDL. 



Top 5 Engineering Firms and 17 Statewide Department of Transportation Agencies 

Lead engineer supporting the top 5 national engineering firms in the US and 17 US 
Statewide Departments of Transportation (DOTs). Responsible for development, 
deployment and integration of lead CAD innovations for the Civil Engineering Community. 



Major Awards and Certifications 



IBM Blue Ribbon Award 

CEO award presented by Lou Gerstner for significant contribution to the development of 
the IBM Global Services blueprint and revitalization of IBM. 

IBM Magic of Leadership Award 

General Manager award presented by Ginny Rometti to the top Global Services Account 
Executive Team that serves as a model of teamwork for IBM's top 50 integrated accounts. 

IBM Golden Circle Awards (multiple) 

Presented to the top 1% of performers for revenue, profit and growth contributions. 

IBM System Engineer Excellence Awards (multiple) 

Presented to the top 1% of performers for technical excellence. 

IBM Certified Senior IT Professional 

In 1995, became the first IBM Certified Senior IT Professional. Was appointed by Senior 
Management as a member of the IBM IT Professional Certified Board. 

IBM Rookie of the Year 

IBM Sr. Vice President Award presented by Ned Lautenbach. Presented to the top first 
year contributor of approximately 20,000 professional hires and 30,000 college hires. 

HEEP (Highway Engineer Exchange Program) 

Multiple guest and keynote speaker engagements 

AASHTO (American Association of State Highway Transportation Officials) 

Multiple guest and keynote speaker engagements 

India Minister of Science & Technology Security Round Table 

One of 18 participants in a day-long roundtable session on the role of I/T in National 
Security. Hosted in Delhi, India by the India Minister of Science & Technology. 



Work History 

CTO                                  Archaio, L.L.C.           01/2009 - Present 
VP Enterprise Planning               VirtualAgility            04/2007 - 01/2009 
Director, Public Safety Solutions    Paaridlan Technologies    10/2004 - 04/2007 
Principal                            IBM Corporation           11/1992 - 09/2004 
Engineering/Scientific SE            IBM Corporation           09/1989 - 10/1992 
VP Operations                        CLM System, Inc.          08/1989 - 09/1985 



Education 

Computer Science                     University of South FL         09/1980 - 12/1985 
Management Information Systems       California Coast University    01/1990 - 12/1991 



DEPARTMENT OF DEFENSE 



Additional Copies 

To obtain additional copies of this report, contact the Secondary Reports 
Distribution Unit of the Analysis, Planning, and Technical Support Directorate at 
(703) 604-8937 (DSN 664-8937) or FAX (703) 604-8932. 

Suggestions for Future Audits 

To suggest ideas for or to request future audits or evaluations, contact the Planning 
and Coordination Branch of the Analysis, Planning, and Technical Support 
Directorate at (703) 604-8939 (DSN 664-8939) or FAX (703) 604-8932. Ideas and 
requests can also be mailed to: 

OAIG-AUD (ATTN: APTS Audit Suggestions) 
Inspector General, Department of Defense 
400 Army Navy Drive (Room 801) 
Arlington, Virginia 22202-2884 

Defense Hotline 

To report fraud, waste, or abuse, contact the Defense Hotline by calling 
(800) 424-9098; by sending an electronic message to Hotline@DODIG.OSD.MIL; 
or by writing the Defense Hotline, The Pentagon, Washington, D.C. 20301-1900. 
The identity of each writer and caller is fully protected. 



Acronyms 

ADCS         Automated Document Conversion System 
CAD          Computer-Aided Design 
CALS         Continuous Acquisition and Life-Cycle Support 
DLA          Defense Logistics Agency 
DPS          Defense Printing Service 
DUSD(L)      Deputy Under Secretary of Defense (Logistics) 
DXF          Data Exchange Format 
IGES         Initial Graphics Exchange Specification 
JEDMICS      Joint Engineering Drawing Management and Information Control System 
OASD(C3I)    Office of Assistant Secretary of Defense (Command, Control, Communications, and Intelligence) 
PC           Personal Computer 




INSPECTOR GENERAL 




DEPARTMENT OF DEFENSE 
400 ARMY NAVY DRIVE 
ARLINGTON, VIRGINIA 22202-2884 



Report No. 96-153 



June 10, 1996 



MEMORANDUM FOR UNDER SECRETARY OF DEFENSE FOR ACQUISITION AND TECHNOLOGY 
               ASSISTANT SECRETARY OF DEFENSE FOR COMMAND, CONTROL, COMMUNICATIONS AND INTELLIGENCE 
               DIRECTOR, DEFENSE LOGISTICS AGENCY 
               DIRECTOR, DEFENSE INFORMATION SYSTEMS AGENCY 



SUBJECT: Evaluation of Automated Document Conversion Implementation 
(Project No. 6PT-5003) 



Introduction 

We are providing this report for information and use. By agreement with the 
Principal Deputy Under Secretary of Defense for Acquisition and Technology, 
we evaluated the concerns raised by Congressman Hunter, Chairman of the 
Subcommittee on Military Procurement, House Committee on National 
Security, and a contractor, AUDRE, Inc., regarding an Automated Document 
Conversion System (ADCS) for engineering drawings. AUDRE, Inc., was one 
of the vendors whose software the Defense Printing Service (DPS) assessed as a 
candidate for the DoD ADCS. The Chief Executive Officer of AUDRE, Inc. 
told Congressman Hunter that the Defense Logistics Agency (DLA) was not 
buying the AUDRE automated document conversion system even though DPS 
assessed it as the best candidate system. In the National Defense Authorization 
Act for Fiscal Year 1996, the conferees expressed their concerns that DoD was 
not making progress to achieve "major cost savings" through adopting the 
automated document conversion technology. Congressman Hunter specifically 
questioned why the DoD was not using a previous $20 million appropriation to 
automate the conversion of engineering drawings. 



Evaluation Results 

We determined that: 

o Limited demand exists for conversion of legacy hard copy engineering 
drawings to vector format. 

o The state-of-the-art in automated document conversion technology has 
not progressed to the level that allows an agency to convert a rasterized drawing 
into its vector equivalent without human intervention. 

o The DoD has developed sound policies and an effective strategy 
through which to implement automated document conversion in a cost-effective 
manner. 

o The DoD has prudently expended document conversion funds. 



Evaluation Objective 

Our objective was to evaluate the degree to which DoD has implemented 
automated document conversion of engineering drawings. Therefore, we 
examined: 

o whether demand exists for automated document conversion of 
engineering drawings, 

o the state-of-the-art in automated document conversion technology and 
whether the technology is cost-effective, 

o whether DoD has established automated document conversion policy 
that will ensure the cost-effective conversion of engineering drawings, and 

o whether DoD has prudently applied automated document conversion 
funding. 



Scope and Methodology 

The scope of our evaluation included a review of document conversion within 
the Continuous Acquisition and Life-Cycle Support (CALS) Joint Engineering 
Drawing Management and Information Control System (JEDMICS) Program 
Management Office and at field agencies. We interviewed agency officials and 
observed agencies' document conversion activities. We also reviewed the 
Automated Document Conversion Master Plan published by the Office of the 
Assistant Secretary of Defense (Command, Control, Communications, and 
Intelligence) (OASD[C3I]). We began the evaluation October 23, 1995, and 
completed it December 21, 1995. 

We attended the CALS International Expo 95. We visited the Office of the 
Deputy Under Secretary of Defense (Logistics) (DUSD[L]), who recommended 
we contact specific organizations that are involved first-hand in document 
conversion. We met with representatives from the JEDMICS Program 
Management Office; AUDRE, Inc.; the Oklahoma City Air Logistics Center; 
DPS at Port Hueneme, California; and the Naval Undersea Warfare Center at 
Keyport, Washington. We also met with the ADCS action officer from the 
Headquarters DPS Plans, Policy, and Technology Assessment Office 
(Enclosure 2). 

Three basic legacy document types within DoD are subject to conversion: 
technical publications, maps, and engineering drawings. Because the 
congressional concerns focus on the conversion of engineering drawings, we 
only evaluated the DoD efforts to convert legacy engineering drawings. 

Prior Audits and Other Reviews 



Since June 1994, three Inspector General Reports have related to automated 
document conversion of data into digitized format. Enclosure 1 discusses the 
prior reports. 



Background 

DoD has millions of legacy weapon systems drawings in hard copy format. 
DoD uses some of these documents in weapon systems' upgrades and in the 
development of new weapon systems. To edit drawings for upgraded or new 
weapon systems, agencies must be able to use the drawings in computer-aided 
design (CAD) systems. Before agencies can use these documents in their CAD 
systems, they must convert the documents to a digital format that a CAD system 
can edit. To convert these documents, engineers may have to spend numerous 
hours to manually trace a scanned image of a drawing or completely reconstruct 
a drawing. Congress believes that DoD is currently using thousands of 
workstations to convert legacy documents in this manner and that DoD is 
spending hundreds of millions of dollars a year to convert these documents. 

In FY 1994, Congress appropriated $14 million to the Defense Logistics 
Agency to competitively procure an ADCS. Congress believed that after 
scanning a document, an ADCS would eliminate the need for further human 
intervention in the conversion process. The ADCS could cut conversion costs 
by reducing the labor needed to convert documents. Therefore, agencies would 
be able to efficiently convert hard copy drawings or drawings on aperture cards 
(punched cards on which a microfilmed document is mounted) into intelligent 
digital files using automated rather than manual methods. The data file output 
by the ADCS was to follow the standards of the CALS initiative. 

CALS is the DoD and industry technological initiative to integrate and use 
automated digital technical data for weapon systems acquisition, design, 
manufacture, and support. The objective of CALS is to facilitate the transition 
from the current paper-intensive weapon systems acquisition to an environment 
that provides for the generation, exchange, management, and use of digital data. 
An ADCS would provide a more cost-effective conversion of hard copy 
engineering drawings into this intelligent digital technical data instead of 
conversion through manual processes. DoD could then use this intelligent 
digital data to reduce weapon system acquisition times and costs. DoD has 
adopted a two-stage process to convert hard copy drawings to an intelligent 
digital format. 

The first stage of the conversion process is to scan the drawing to create a 
digital raster file of the drawing. A raster image is a bit-mapped representation 
of a drawing: a digital photograph. The degree of resolution is measured in 
dots per inch. Raster images can only be edited by adjusting the values of 
individual dots; therefore, they do not provide an editable CAD-ready file. 
They also require human interpretation as do hard copy drawings and drawings 
on aperture cards. However, raster files are good for archiving and for print- 
on-demand requirements. 
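
To make the raster description above concrete, the following Python sketch models a bilevel raster as a grid of dots. The sheet size, the 200 dpi resolution, and the function names are assumptions chosen for illustration and do not come from the report.

```python
def raster_size(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Return (columns, rows) of dots for a sheet scanned at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def blank_raster(cols: int, rows: int) -> list[bytearray]:
    """Create a bilevel raster: 0 is a white dot, 1 is a black dot."""
    return [bytearray(cols) for _ in range(rows)]


# Hypothetical 11 x 8.5 inch sheet scanned at 200 dots per inch.
cols, rows = raster_size(11.0, 8.5, 200)
image = blank_raster(cols, rows)

# "Drawing" a horizontal line means setting individual dots one by one.
# Nothing in the grid records that these 400 dots form a single line,
# which is why a raster file is not directly editable in a CAD system.
for x in range(100, 500):
    image[50][x] = 1
```

The grid here is 2200 by 1700 dots. Any edit, such as moving the line, has to be expressed as changes to individual dot values.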

The second stage of the conversion process converts the raster file to a vector 
format. A vector file is the presentation or storage of images as sequences of 
line segments. These data files consist of geometrically accurate and precise 
representations of the product, together with associated annotations such as 
dimensions and tolerances. Vector files can be scaled, which means one can 
zoom in on the details of a drawing. Also, they are more easily edited than 
raster files. Creating a vector file requires converting the lines and arcs in a 
scanned bitmap to the equivalent structure in a CAD system. Before automated 
document conversion systems, agencies were only able to do this conversion by 
manually tracing or reconstructing the drawing. The intent of the ADCS 
initiative was to offer a system to automatically (without human intervention) 
convert a scanned raster image to a vector format. 
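
By way of contrast with a raster grid, a vector file can be sketched as a list of line segments. The `Segment` type and `scale` helper below are hypothetical names used only to illustrate why vector data can be scaled and edited; they do not represent any actual CAD file format.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    """One line segment of a drawing, stored as endpoint coordinates."""
    x1: float
    y1: float
    x2: float
    y2: float

    def length(self) -> float:
        return math.hypot(self.x2 - self.x1, self.y2 - self.y1)


def scale(segments: list[Segment], factor: float) -> list[Segment]:
    """Scale every segment about the origin; geometry stays exact."""
    return [
        Segment(s.x1 * factor, s.y1 * factor, s.x2 * factor, s.y2 * factor)
        for s in segments
    ]


# A triangle with sides 3, 4 and 5 units.
outline = [Segment(0, 0, 3, 0), Segment(3, 0, 3, 4), Segment(3, 4, 0, 0)]
zoomed = scale(outline, 4.0)

# Editing is a change to one object rather than to many individual dots.
edited = [replace(outline[0], x2=5.0)] + outline[1:]
```

Scaling by 4 yields sides of 12, 16 and 20 units with no loss of precision, which corresponds to the zooming and editing advantages the report attributes to vector files.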

Toward that objective, DLA sponsored a state-of-the-art assessment through the 
DPS in April 1994. DPS conducted this assessment to identify candidate 
vendor automated document conversion technology packages that convert 
technical publications, maps, and engineering drawings. The assessment 
included six vendors who offered products able to automatically convert 
engineering drawings to vector format. Of the six vendor products, DPS 
assessed the AUDRE Automatic Conversion System as the best candidate 
engineering drawing ADCS. DPS would later test the AUDRE Automatic 
Conversion System for Congress. 

In FY 1995, Congress appropriated $20 million to DoD to implement the 
ADCS for engineering drawings Defense-wide. Congress also directed that the 
OASD(C3I) establish and implement a master plan for all acquisitions of 
automated document conversion systems, equipment, and technologies. 

Between July and November 1994, DPS tested the AUDRE Automatic 
Conversion System. The test showed automation-assisted labor savings could 
result from using the AUDRE Automatic Conversion System to convert 
engineering drawings to vector format. However, DPS added that automated 
document conversion was not mature enough to completely replace trained 
engineers with production operators. After DPS tested the AUDRE Automatic 
Conversion System, it stored the tested AUDRE software packages at various 
DPS locations. However, these systems were not added to the DPS inventory 
for use in document conversion because of a lack of user requirements 
according to DPS. 



Discussion 

Demand for Automated Document Conversion. Our evaluation of automated 
document conversion began with determining whether the universe of 
documents eligible for document conversion contains any demand for 
vectorization of legacy hard copy engineering drawings. 

DoD estimates that as many as 100 million engineering drawings exist. The 
universe of engineering drawings includes architectural engineering, electrical 
engineering, and mechanical engineering drawings. OASD(C3I) officials stated 
that no credible estimate of organization records eligible for conversion exists. 
However, DoD estimates that only 10 percent of raster engineering drawings 
require conversion to vector format. See Figure 1. OASD(C3I) officials 
indicate that the many technology-related marketing claims may have generated 
an artificial demand for conversion of DoD paper records to digital media 
formats. 




Figure 1. DoD Estimate of Engineering Drawings Requiring Conversion to 
Vector Format 

None of the agencies we visited expressed a requirement to bulk convert 
documents to vector format; they implemented conversion requirements as 
needed. One agency even found bulk conversion of legacy hard copy 
documents to digital raster format to be unnecessary because it no longer uses 
many of the drawings. Therefore, agencies must first determine which of their 
legacy drawings in general they might use, if any, before they consider which 
drawings they might need to convert to a vector format. 

According to OASD(C3I), deciding which documents to convert is crucial to 
any conversion project. Military mission and business requirements and a 
business case that clearly explains the functional and economic benefits 
anticipated from conversion will guide the conversion decisions. In addition to 
military mission and business requirements, these decisions include the timing 
of automated conversion and the justification for automated conversion. The 
decisions are delegated to responsible functional and DoD Component officials. 

Conclusion. Limited demand exists for conversion of legacy hard copy 
engineering drawings to vector format. 

Cost-effectiveness of ADCS Technology. To evaluate ADCS technology in 
the conversion of engineering drawings from raster to vector format, we 
examined current available conversion technology and the future of automated 
document conversion technology. We also looked at the minimum operator 
skill and knowledge required to provide a cost-effective conversion, the 
operating system and hardware platform requirements that will most cost- 
effectively use current DoD platforms within the engineering community, and 
which data format standards the ADCS should produce to provide a cost- 
effective digital file for use in the acquisition life-cycle. 

Current Conversion Technology. To convert a hard copy document to 
a vector file, DoD has adopted a two-stage process. The first stage of 
conversion establishes an interoperable baseline digital raster format, allowing 
agencies to share information electronically. The second stage further processes 
the digitized image into a more complex digital vector format if required by the 
target application. This flexible two-stage approach extends the potential for 
reuse of a converted document to satisfy different user requirements for the 
same document. Within this two-stage conversion process, we identified several 
steps necessary to achieve a quality CAD-ready vector file that accurately 
depicts the original drawing. 

The first step is to scan the hard copy drawing or aperture card to create a 
digital raster image. The second step involves a quality assurance function to 
ensure the quality of the scanned raster image. This process may involve 
deskewing and despeckling the image. The third step is to convert the raster 
file to a vector file. This step identifies and captures the different parts of the 
image. This step in the conversion process is automated, which produces an 
initial vector image that the ADCS operator can then edit. In the fourth step, 
the ADCS operator performs a quality assurance check and edit of the initial 
vector image. The ADCS operator would then pass the vector file to the using 
engineer for a final quality assurance check to ensure that all parts of the image 
were captured and converted correctly. This step may involve editing objects or 
symbols that the ADCS may have misrecognized in the third step and the ADCS 
operator missed or was unable to interpret in the fourth step. Since this fifth 
step should be done by a subject matter expert, such as an engineer, it could 
most cost-effectively be accomplished in the using agency's target application 
that the engineer is familiar with, such as AUTOCAD. See Figure 2. 

Figure 2. Two-stage, Five-Step Automated Document Conversion Process 

The condition of the source document and the type of drawing determine the 
amount of human interaction by the ADCS operator and the subject matter 
expert. Because the second, fourth, and fifth steps require the interaction of 
ADCS operators and subject matter experts, such as engineers, to generate an 
acceptable CAD-ready file, an ADCS is considered to provide only an 
automation-assisted conversion capability rather than a fully automated 
conversion capability. However, one of the agencies we talked with that uses 
production operators (as opposed to engineers) stated that prior to the AUDRE 
Automatic Conversion System, it would not convert engineering drawings. The 
agency said that the AUDRE system is its only option for CAD-ready 
conversions. The only current alternative to automation-assisted drawing 
conversion is tracing or completely reconstructing a drawing. Another agency, 
which was not familiar with the AUDRE Automatic Conversion System, 
continues to use manual conversion methods. 

AUDRE executives explained that the sole purpose of its ADCS is to be a 
conduit to other applications, such as AUTOCAD, so that engineers can 
complete the conversion and get on with their upgrades and changes more 
quickly. An ADCS cuts the conversion time so that engineers can access an 
editable file in a more timely manner so that they could then change and 
upgrade a drawing in their target CAD environment. One agency plans to use 
the AUDRE Automatic Conversion System to provide one of its customers a 
Data Exchange Format (DXF) file to convert to another format using another 
application. Another agency has been using AUDRE for 2 to 3 years as a 
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routine part of its production operations. The agency provides its customers 
with an editable vector file so that the customer engineers will not have to 
redraw the drawing from scratch. The agency explained that a conversion 
process that would take an engineer weeks to complete using tracing methods 
would only take it a day using the AUDRE Automatic Conversion System. 

Future Conversion Technology. According to experts in electronic 
document conversion technology, fully automated document conversion systems 
are not likely to be developed soon. Therefore, the engineering industry is 
likely to be using automation-assisted conversion technology for some time. 

ADCS Operator Skill and Knowledge. The ADCS Test found that 
blue collar administrative personnel with basic computer skills and no 
engineering drawing experience can attain the skill necessary to cost-effectively 
convert engineering drawings. Compared to the manual alternatives, the ADCS 
Test Report indicated that the most productive administrative ADCS operators 
achieved better throughput times and quality than either the professional 
engineers or the commercial conversion operators using manual methods. The 
report added that costs to convert a typical drawing vary from $200 if redrawn 
by engineers to $119 if converted through existing commercial sources to $85 if 
converted by experienced operators using automation-assisted conversion 
techniques. The report estimates that labor savings range from 20 percent to 
50 percent over the manual redraw methods that professional engineers use. 
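
The per-drawing costs quoted above can be compared with a short calculation. This is only an illustrative sketch: the dollar figures come from the report, while the function and variable names are assumptions introduced here. Note that it compares total per-drawing cost, which is a different measure from the labor-savings range the report estimates.

```python
# Per-drawing conversion costs quoted in the ADCS Test Report (US dollars).
COSTS = {
    "redrawn by engineers": 200,
    "existing commercial sources": 119,
    "automation-assisted conversion": 85,
}


def savings_vs_baseline(cost: float, baseline: float) -> float:
    """Return the percentage saved relative to the baseline cost."""
    return (baseline - cost) / baseline * 100.0


# Use the engineer redraw cost as the baseline for every comparison.
baseline = COSTS["redrawn by engineers"]
savings = {name: savings_vs_baseline(cost, baseline) for name, cost in COSTS.items()}

for name, pct in savings.items():
    print(f"{name}: ${COSTS[name]} per drawing, {pct:.1f}% below engineer redraw")
```

On these figures, automation-assisted conversion costs 57.5 percent less per drawing than an engineer redraw, and commercial conversion costs 40.5 percent less.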

One agency has been using the AUDRE Automatic Conversion System for the 
last 2 to 3 years. Most of its operators are high school graduates with a printing 
operation background rather than a computer or data entry background. The 
operators learned the system in about 3 weeks. The agency stated that its 
operators have converted in a day what their customer agency engineers would 
take weeks to convert using manual tracing methods. 

ADCS Operating System and Hardware Platform Requirements. 

The AUDRE Automatic Conversion System can operate on a Personal 
Computer (PC) (Sun Solaris operating system) or UNIX (Hewlett Packard or 
Sun) platform. The hardware platform is not a factor in producing a 
CALS-standard vector output. In addition, the ADCS can output the vector file in the 
appropriate format for the target application. However, AUDRE executives 
stated that the PC platform presents memory management problems. They 
added that their software can co-exist on a system already containing another 
application such as AUTOCAD. 

ADCS Digital Vector Data Format Standards. The digital data 
produced by the ADCS was required to follow the CALS digital data format 
standards. The Initial Graphics Exchange Specification (IGES) is the CALS 
digital vector format standard for CAD system engineering drawings. 

The IGES file format treats the data required to describe and communicate the 
essential characteristics of physical objects as a file of entities. Each entity is 
represented in an application-independent format, which can be mapped to and 
from a native representation of a specific CAD system. Because IGES is 
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available in five versions, problems can result when trying to transfer IGES files 
between different CAD systems. The transfer can result in mis-mappings and 
the loss of data. 

OASD(C3I) officials state that DoD recognizes de facto standards until formal 
standards are developed and adopted by recognized standards organizations. 
The de facto industry CAD standard format is DXF. DPS chose to use the 
DXF format during the ADCS test instead of the IGES format because it is the 
most acceptable common CAD file format within the engineering industry. 

The AUDRE Automatic Conversion System can readily output to either format. 
However, the two agencies we visited that use the AUDRE Automatic 
Conversion System produce digital data in the DXF format instead of the IGES 
format. These agencies use the DXF format because it is a neutral file format, 
which the end-user can convert to any other format. 
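
To illustrate why DXF is easy for an end-user to convert, the sketch below reads LINE entities from an ASCII DXF fragment. ASCII DXF stores data as pairs of lines, a numeric group code followed by a value; group code 0 names an entity, 8 its layer, and 10/20 and 11/21 the start and end coordinates of a LINE. The sample drawing, the layer name, and the function names are assumptions made for illustration; this is not output from the AUDRE system.

```python
from typing import Iterator

# Hand-written ASCII DXF fragment containing two LINE entities (illustrative only).
SAMPLE_DXF = """0
SECTION
2
ENTITIES
0
LINE
8
WALLS
10
0.0
20
0.0
11
120.0
21
0.0
0
LINE
8
WALLS
10
120.0
20
0.0
11
120.0
21
96.0
0
ENDSEC
0
EOF
"""


def read_pairs(text: str) -> Iterator[tuple[int, str]]:
    """Yield (group_code, value) pairs from ASCII DXF text."""
    lines = text.splitlines()
    # Each pair occupies two consecutive lines: the code, then the value.
    for i in range(0, len(lines) - 1, 2):
        yield int(lines[i].strip()), lines[i + 1].strip()


def extract_lines(text: str) -> list[dict]:
    """Collect LINE entities as dicts holding the layer and endpoint coordinates."""
    entities = []
    current = None
    for code, value in read_pairs(text):
        if code == 0:
            # Group code 0 starts a new entity or a structural marker,
            # so any LINE being collected is complete at this point.
            if current is not None:
                entities.append(current)
            current = {"layer": None} if value == "LINE" else None
        elif current is not None:
            if code == 8:
                current["layer"] = value
            elif code in (10, 20, 11, 21):
                key = {10: "x1", 20: "y1", 11: "x2", 21: "y2"}[code]
                current[key] = float(value)
    return entities


lines_found = extract_lines(SAMPLE_DXF)
for entity in lines_found:
    print(entity)
```

Because every value is tagged with a plain-text group code, a target application can map each entity to its own native representation without knowing which program produced the file.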

Conclusion. The Automated Document Conversion System provides a 
cost-effective automation-assisted conversion capability using administrative 
operators. Therefore, the ADCS can provide significant labor savings over 
highly skilled engineers manually tracing or reconstructing drawings. The 
potential for savings depends partly on whether an agency has a sufficient 
demand to convert documents to vector format (See "Demand for Automated 
Document Conversion," page 8). 

Automated Document Conversion Policy. We reviewed whether DoD has 
established an automated document conversion policy that will ensure the cost- 
effective conversion of engineering drawings. 

In the FY 1995 Defense Appropriation Act, Congress directed the OASD(C3I) 
to establish and implement a master plan for all acquisitions of automated 
document conversion systems, equipment, and technologies. In April 1995, 
OASD(C3I) published the Automated Document Conversion Master Plan (the 
Master Plan). 

The Master Plan provides strategic guidance for all automated document 
conversion acquisitions within DoD. It focuses on conversion from paper or 
microform to digital formats. The Master Plan addresses three main areas: 

o the "DoD Conversion Environment," which summarizes the mission 
and business needs for automated document conversion; 

o the "DoD Conversion Strategy," which describes the DoD strategy for 
achieving a consistent approach to automated document conversion; and 

o "DoD Roles and Responsibilities" applicable to automated document 
conversion. 
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One theme of the "DoD Conversion Environment" is for agencies to "follow 
existing policy." The Master Plan views document conversion as an activity 
within the "records management" business process. It also states that the 
requirement or business need for document conversion must be justified using 
existing Corporate Information Management principles and automated 
information system life-cycle management policy. 

One component of the "DoD Conversion Strategy" focuses on the management 
of automated document conversion system acquisitions and requirements. This 
management component provides agency guidance to determine whether a 
proposed automated document conversion acquisition meets operational 
requirements and produces sufficient cost savings. This guidance provides 
agencies with decision criteria in four areas: Requirements Determination, Cost 
Justification, Document Candidate Selection, and Technical Capability. 
Agencies should consider these criteria before deciding to proceed with 
automated document conversion. 

The Office of the DUSD(L) has also taken steps to ensure the cost-effective 
conversion of engineering drawings. The Office of the DUSD(L) issued a data 
call in August 1995 to the Services regarding what their experiences have been 
with document conversion, their data conversion requirements, their experience 
with the cost-effectiveness of document conversion, and what software and 
hardware they use in document conversion. The Office of the DUSD(L) will 
also decide whether procuring and fielding a PC-based ADCS is worthwhile 
given the preponderance of UNIX workstations in the DoD engineering 
community. 

Congress has requested that the Office of the DUSD(L) provide it the results of 
these efforts. Specifically, the report will address: 

o the logistics community's requirements and strategy for raster to 
vector conversion, 

o the drawing document universe in the field, 

o the number of engineers in the field, 

o the level of existing UNIX and PC platforms, and 

o an acquisition plan for the software and hardware and who the 
vendors will be for each. 

Conclusion. The DoD has developed sound policies and an effective 
strategy through which to implement automated document conversion in a cost- 
effective manner. 

Application of Automated Document Conversion Funding. We reviewed 
whether DoD has applied automated document conversion funding in a prudent 
manner. 



With the $14 million appropriated to DLA in 1994, DLA procured ADCS 
hardware and software and met other costs, such as salaries, training, travel, 
and conversion, associated with its evaluation of the ADCS technology. Also, 
DLA funded an independent operational appraisal of the ADCS technology by 
the Defense Information Systems Agency Joint Interoperability Test Command. 

Of the $20 million FY 1995 appropriation, DLA allocated $10 million for 
procurement and $10 million for operation and maintenance. As of 
December 8, 1995, DLA had spent $7.5 million of the procurement allocation 
to procure ADCS hardware and software. In May 1995, DLA procured 50 
AUDRE software packages and 34 Hewlett Packard workstations for some field 
agencies to use in their document conversion efforts. In the first quarter of 
FY 1996, in response to requests from Congress to evaluate raster to vector 
conversion products in the field, DLA procured an additional 20 AUDRE 
workstation-based systems, 100 AUDRE PC-based systems, and 100 each PC- 
based systems of four other vendors for the Services to evaluate. 

Also, DLA provided $2.2 million of the Operation & Maintenance allocation to 
one specific agency to convert its documents from raster to vector. DLA had 
requested the Services determine whether or not they had requirements for 
raster to vector conversion; only one Service responded. As a result, DLA 
wants to spend the remaining $7.9 million on raster to vector conversion within 
a single weapon system program. 

Conclusion. DoD has prudently expended document conversion funds. 



Management Comments 

We provided a draft of this report to you on April 25, 1996. Because this 
report contains no findings or recommendations, written comments were not 
required and none were received. Therefore, we are publishing this 
memorandum report in final form. 

We appreciate the courtesies extended to the evaluation staff. If you have 
questions on this report, please contact Mr. Kenneth H. Stavenjord, Technical 
Director, at (703) 604-8952 (DSN 664-8952) or Mr. Gregory R. Donnellon, 
Evaluation Project Manager, at (703) 604-8946 (DSN 664-8946). See 
Enclosure 3 for the report distribution. The evaluation team members are listed 
inside the back cover. 




Robert J. Lieberman 
Assistant Inspector General 
for Auditing 



Enclosure 
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Summary of Prior Audits and Other Reviews 



Three Inspector General, DoD reports covered issues related to this evaluation. 

Report No. 95-060, "Digital Mapping, Charting and Geodesy Data 
Standardization," December 19, 1994, found that the Defense Mapping Agency 
had taken positive actions to standardize digital mapping, charting, and geodesy 
data. The purpose of standardization was to promote electronic transfer 
between military systems and to promote system compatibility and 
interoperability. The report made no recommendations for corrective actions 
and no comments were made in response to the final report. 

Report No. 95-043, "Management of the Digital Production System 
Development at the Defense Mapping Agency," November 28, 1994, found that 
the Agency did not identify customer requirements, did not analyze the cause of 
software problems, and did not correct configuration management deficiencies. 
The report recommended that the Defense Mapping Agency improve its product 
specification development and revisit its problem reporting. The report also 
recommended corrective actions on configuration management procedures and 
the Digital Production System. Finally, the report recommended a Milestone 
IV (Major Modification Approval) review of the Digital Production System. 
Management concurred with the recommendations regarding program 
management and agreed to review the Digital Production System. 

Report No. 94-INS-05, "Management of Digitized Technical Data," July 8, 
1994, found a lack of management and clear and consistent guidance from 
DoD. The report specifically criticized management of the CALS initiative 
because CALS was not defined as a strategy or as a program. The lack of 
definition created an ineffective management structure, late allocation of funds, 
a lack of policies on reimbursement for operating funds, and a lack of specific 
guidelines needed to acquire and manage digitized technical data. The Joint 
CALS system had similar problems. The report recommended several changes, 
including greater structuring of CALS and Joint CALS management, writing 
action plans for implementation, changes to regulations, greater oversight, and 
changes to data standards. DoD agreed to most recommended changes; 
however, the DoD rejected the recommendation to modify the funding of CALS 
by removing the program from the Defense Business Operations Fund and 
moving it under direct appropriations. 
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A Signature Wo 

H-K Resources Ltd 

Drafting Services : Architectural Civil Structural Mechanical E 


Why Convert Drawings to CAD? 

There are traditionally three main ways of working on existing 
drawings, namely: 



1. Manually Maintaining Drawings on Paper 

2. Scanning and Editing Raster Images 

3. CAD Conversion 

CAD Conversion offers significant advantages over other 
methods including: 

► Reduced Cost of Revisions 

► Increased Drawing Value 

► Reduced Drawing Life Cycle Cost 

► Reduced Storage and Creation of a Standard Filing 
System 

► Ability to Enhance Company's Competitive Advantage 

► Ability to Maintain Consistent Level of Quality for 
Projects 



1. Reduced Cost of Revisions 

Revisions done in CAD may be 2-8 times faster than the same 
revision done by manual methods. And converting drawings to 
fully vectorised files (instead of just raster images) increases the 
ease of editing since the user no longer needs to switch back and 
forth between raster and vector files. Conversion allows the full 
capabilities of a CAD system to be utilised. And what's best, CAD 
conversions are low-cost, usually less than $50 per drawing. 



2. Increased Drawing Value 

Once drawings are in CAD format, the uses for them increase 
dramatically. For instance, intelligent CAD files can be used with 
cost estimating software, facilities management applications such 
as area calculation and inventory tracking as well as engineering 
design and analysis software or numerically controlled machining 
in manufacturing. Intelligent CAD drawings can also significantly 
reduce the time required to extract data from drawings and enter 
it into databases which are used for such things as project 
management, quality assurance, maintenance and material 
control. Time savings are estimated to range from a few days to 
even a few months for large projects. 

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3. Reduced Drawing Life Cycle Cost 

Since conversion is a one-time cost, the second and third revisions 
to a drawing produce even greater savings, reducing the overall 
cost of maintaining a drawing throughout its useful life. This 
saving also results in earlier payback for conversion projects. In 
addition, since physical drawings tend to be huge, cost-effective 
storage becomes a major issue if drawings need to be kept for a 
substantial length of time. Electronic CAD drawings can easily be 
kept on a single hard-disk. 

Most building control authorities require professionals to maintain 
copies of all project drafts and drawings for a substantial period of 
time (sometimes as long as 10 years!!). For a medium-sized firm, 
compulsory storage of these physical drafts has been found to 
take up as much as 2,000 sq ft of space annually. That's enough 
space to house an additional 20 staff, or the equivalent of paying 
an additional $10,000 every month. All these can be stored 
electronically on a single box of CD-ROMs. 


4. Reduced Storage Requirements and Creation of a 
Standard Filing System 

Establishing CAD as the standard filing procedure will decrease the 
amount of engineering time spent looking for drawings, and also 
the number of lost drawings. Filing more drawings electronically 
may also reduce square footage being used to store paper 
drawings. Typically 9 filing cabinets (20,000 pages) or over 3,000 
large format drawings can be stored on a single CD-ROM disc and 
any individual file, word, phrase or drawings can be located within 
seconds via anyone with appropriate network access. 


5. Obtain a Competitive Advantage 

Converting drawings to CAD allows a firm to project a consistent, 
progressive and high quality image to their clients by eliminating 
the use of outdated manual drafting methods. CAD is recognized 
as the industry standard in technical drafting and can now be used 
for drawings which were created before CAD. 


6. Maintain a Consistent Level of Quality for all Your 
Projects 

Begin the commitment of quality at the very first stage of any 
project, even those projects utilizing old and tattered paper 
drawings. With CAD conversion, company standards such as text 
fonts, line weights and other drafting standards are enforced. CAD 
drawings offer a much higher and more consistent level of quality. 

What's more, it only costs as little as $50 to convert your paper 
drawings to CAD. So why not contact us now to find out more! 



http://www.hk-resources.com/Why%20Convert%20to%20CAD.htm 
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Examining the Encryption Threat 

Jason Siegfried 
Christine Siedsma 
Bobbie-Jo Countryman 
Chester D. Hosmer 
Computer Forensic Research and Development Center 

Abstract 

This paper is the result of an intensive six-month investigation into encryption 
technologies conducted at the Computer Forensic Research & Development Center 
(CFRDC) at Utica College. A significant number of encryption applications were 
collected and cataloged. A roadmap for the identification of the unique characteristics 
of encrypted file formats was created. A number of avenues were explored and the 
results documented. The actual process is not outlined comprehensively due to 
proprietary needs; however, the following briefly details the process and the significance 
of our findings. 

Introduction 

In 2001, a firestorm of controversy erupted in the case of United States v. Nicodemo 
Scarfo Jr. At issue was the use of Carnivore, a covert key-logging tool that had been 
the subject of much scrutiny, and its sophisticated successor, Magic Lantern. Because 
the suspect used advanced encryption technology, law enforcement had to use a 
sniffing keystroke logging tool. The legal and covert deployment of Carnivore and Magic 
Lantern caused many law-abiding citizens to feel that the time of the Orwellian coined 
term, "Big Brother" had arrived. However, it became evident that law enforcement was 
unable to decrypt and access encrypted data. The Scarfo case concerning law 
enforcement's need for such tactics as Carnivore or Magic Lantern produced fear in law 
abiding citizens and demonstrated that law enforcement did not have, nor currently has, 
a better option. 

Law enforcement is currently at the mercy of criminal or terrorist entities that employ 
sophisticated encryption applications. The future success of Magic Lantern is 
questionable considering two factors: 1) law enforcement must be aware of criminal 
activities prior to installing the Magic Lantern tool; and 2) the hacker community will not 
allow such covert techniques to persist, as evidenced by the following quote obtained 
via Google's cached feature from a website that is no longer available on the Internet, 

Seeing as how some antivirus software manufacturers will not be looking 
for the FBI's Magic Lantern virus, it seems to me that the open source/free 
software community should be doing what it does best: doing it ourselves."¹ 



¹ Investigating Cyber Knight. Posted 24 Nov 2001 by Pseudonym. Original URL 
<http://www.advogato.org/article/384.html> is no longer available, but access to 
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The hacking community's ability to defeat new technologies jeopardizes the success of 
Magic Lantern. 

The progressive sophistication and strength of encryption technologies remains a 
significant obstacle to law enforcement efforts to obtain digital evidence protected by 
sophisticated mathematical manipulations. The strength of encryption applications 
consistently advances; the number of encryption applications continues to multiply, and 
the availability of these sophisticated applications via the Internet continues to increase. 
Regardless of the grandiose speeds of modern computing technologies, the ability to 
crack sophisticated encryption tools employed by criminal or terrorist entities remains 
mind-boggling. The following table demonstrates the machine power required to crack 
an encryption key in 1997. 



Encryption Name         Time Taken to                  Machine Power Required          Maximum Speed
& Strength              Crack Key                      to Crack Key                    Required to Crack Key
----------------------  -----------------------------  ------------------------------  -----------------------
48 bit RC5              13 days                        5000 max, 7000 overall          440,000,000 keys/sec
56 bit RC5              270 days                       4000 teams, 10,000's machines   7,000,000,000 keys/sec
64 bit RC5              1,470 days                     Not Available                   88,000,000,000 keys/sec
Elliptic Curves         120+ days                      9,500 in total, 5,000 active    Not Available
(109 bit)                                              at one time
RSA 512 bit             Polynomial selection - 2.2     292 plus a Cray for the last    Not Available
                        months; Factoring - 5.2        stage
                        months
56 bit DES              ~90 days                       Max: 14,000 in a single day     7,000,000,000 keys/sec

Table 1 - Required Time, Machine Power, and Speed in 1997 to Crack Encryption 



While 1997 data may seem outdated, encryption key sizes have consistently increased 
along with computing power. In 1997, did law enforcement have 
the type of machine power, manpower, or financial support to devote such resources to 
cracking one single encryption key? How likely is it that law enforcement has the 
resources today to crack the encryption keys deployed in 2004? Furthermore, as the 
term "quantum encryption" is appearing in security conferences and underground 
hacker sites alike, law enforcement's ability to catch up to sophisticated encryption tools 
is nil. 
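
The scale of the problem can be checked with simple keyspace arithmetic. The sketch below assumes a constant search rate and an exhaustive search of every possible key; both are simplifications introduced here. The times in Table 1 are elapsed times from distributed efforts whose rate varied, and a search usually succeeds before the whole keyspace is exhausted, so the results will not match the table exactly.

```python
SECONDS_PER_DAY = 86_400


def exhaustive_search_days(key_bits: int, keys_per_second: float) -> float:
    """Days needed to try every key of the given length at a constant rate."""
    return (2 ** key_bits) / keys_per_second / SECONDS_PER_DAY


# Peak rates taken from Table 1 (keys per second).
rc5_56_days = exhaustive_search_days(56, 7_000_000_000)
rc5_64_days = exhaustive_search_days(64, 88_000_000_000)

# Each additional key bit doubles the work, so moving from 56 to 128 bits
# multiplies the keyspace by 2**72.
growth_56_to_128 = 2 ** (128 - 56)

print(f"56-bit keyspace at 7e9 keys/sec: {rc5_56_days:.0f} days")
print(f"64-bit keyspace at 8.8e10 keys/sec: {rc5_64_days:.0f} days")
print(f"128-bit keyspace is {growth_56_to_128:.2e} times larger than 56-bit")
```

The doubling per bit is the point of the example: hardware that is thousands of times faster offsets only a dozen or so additional key bits.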

Encryption applications have historically been deployed for legitimate purposes such as 
privacy, protection, and security. However, the utilization of advanced encryption 



<http://216.239.37.104/search?q=cache:6EXloJTwLalcJ:ww.advogato.org/article/384.html+Investigating+Cyb 
er+Knight&hl=en&ie=UTF-8> is available. 

² Brute force attacks on cryptographic keys, <http://www.cl.cam.ac.uk/~rnc1/brute.html>. Accessed 21 January 
2004. 
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algorithms has developed into a dual technology applied for legitimate as well as 
nefarious purposes. In 1997, Dorothy Denning and William Baugh made the following 
statement, "...our findings suggest that the total number of criminal cases involving 
encryption worldwide is at least 500, with an annual growth rate of 50 to 100 percent."³ 
With the ease of use, current availability, and multiple hacking communities, it can be 
presumed that even Denning and Baugh understated the use of encryption technologies 
by criminal and terrorist entities. In the 1999-2000 document, Current U.S. Encryption 
Regulations: A Federal Law Enforcement Perspective, the author describes the threat 
as follows: 

...Absent some form of key recovery or recoverable method, a brute force 
attack will not meet law enforcement needs. If we are working on a 
terrorist case and intercept a communication that we believe to be in 
furtherance of criminal activity, and that communication is encrypted - say 
with PGP, which is 128 bit encryption, a brute force attack to decode one 
PGP message, using a Cray computer, would take nine trillion times the 
age of the universe... This is our greatest fear, that, one day, a terrorist 
attack will succeed because law enforcement could not gain immediate 
access to the plaintext of an encrypted message...⁴ 

Without the use of a covert key logging technology such as Carnivore or Magic Lantern, 
the use of sophisticated encryption applications can stop a digital investigation cold in 
its tracks. Encrypted data has become a clear obstacle to the furtherance of successful 
computer forensics investigations. This paper details an intensive six-month research 
effort, which identified a number of significant characteristics that can be incorporated 
into a digital forensics investigation. It is hoped that it will provide a number of benefits 
to law enforcement professionals. 

The ability to identify encryption applications using forensic file identification techniques 
is one that has not yet been seriously explored. Although this six-month, manually 
intensive study did not produce an easy way to expedite the cracking of an encryption 
key or password, it certainly did produce a number of significant results that will 
expedite the identification of the utilization of an encryption application, among other 
characteristics of the encryption application. 

Currently, random, unintelligible data, not immediately attributed to a file can be 
inadvertently identified as binary file remnants, previously deleted data, or partially 
overwritten files, while in fact, it is possible that remnant data can be attributed to 
encrypted data. The significance of this study's findings can support and assist 
investigators in quickly identifying the presence of an encryption application, the specific 



³ Dorothy Denning and William Baugh. "Encryption and Evolving Technologies as Tools of Organized Crime 
and Terrorism." 

⁴ Smith, Charles Barry. 1999-2000. Current U.S. Encryption Regulations: A Federal Law Enforcement 
Perspective, <http://www.law.nyu.edu/journals/legislation/articles/vol3num1/smith.pdf>. Accessed 21 January 
2004. 
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encryption application used to encrypt digital data, and the signature and/or patterns 
associated between the encryption application and its subsequent encrypted data. 

File Identification through Binary Analysis 

A file header is the first portion of an electronic file that contains metadata, as opposed 
to data.⁵ "Metadata is the background information that describes the content, quality, 
condition, and other appropriate characteristics of the data."⁵ It is essentially "data about 
data." The file header itself is transparent to the user and can only be viewed with a 
low-level disk viewer/editor. It contains information necessary for the application to 
"recognize" and "understand" the file. The presence, byte size, and data content of file 
headers are unique to virtually every application. For example, a Microsoft Word 
document (.doc) contains very structured and lengthy headers and footers embedded 
throughout the file (10,752 bytes), as opposed to a basic text file (.txt) that does not 
even have a header or any other embedded data. Although file header content varies 
from application to application, the most consistent feature is the presence of a file 
signature. 

File signatures, unlike file extensions, are not easily altered and are thus the more accurate 
means of file identification. Additionally, file extensions are generally limited to only 
three or four characters, so the extension itself tends to be reused for multiple file types.⁶ 
Forensic file type identification is a process used by computer forensic investigators to 
examine the metadata that applications embed in the files that they create (file header 
and/or footer), and is the most reliable way of identifying the actual file type. As with any 
other application that creates files, it is assumed that the resulting encrypted file will 
have embedded metadata that the file encryption application would use to recognize it 
as "one of its own," not just by the file extension but also by the addition of file header 
and/or footer information.⁷ 
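
A minimal sketch of signature-based identification follows. The four signatures listed are widely documented values found at offset 0 of the corresponding file types; the table layout and the function names are assumptions introduced for illustration, and a real forensic tool would use a far larger signature catalog and would also inspect footers.

```python
# A few widely documented file signatures ("magic numbers") found at offset 0.
SIGNATURES = {
    b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1": "OLE2 compound file (e.g. legacy Microsoft Word .doc)",
    b"%PDF": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"PK\x03\x04": "ZIP archive",
}


def identify(data: bytes) -> str:
    """Return a description of the file type based on its leading bytes."""
    for magic, description in SIGNATURES.items():
        if data.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Identify a file on disk from its header bytes, ignoring its extension."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))


# A PDF that has been renamed to "notes.doc" is still identified as a PDF,
# because only the content is examined.
print(identify(b"%PDF-1.4\n"))
print(identify(b"plain text with no header"))
```

Since the extension is never consulted, renaming a file does not change the result, which is the property the paragraph above relies on.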

One purpose of this study was to advance forensic file type identification to the next 
level through very deep and low-level analysis of encrypted files. The goal for this phase 
of the experiment was to expand the scope of research to identify not only file 
signatures, but other important metadata as well. The result was a process to 
recognize encrypted file signatures and extract detailed information from the encrypted 
file header. 

Two popular file encryption applications were chosen to perform the deep, low-level 
analysis on. Two programs were chosen to achieve some diversity: RipCoder,⁸ a very 



⁵ http://inside.uidaho.edu/tutorial/overview/overview.htm 

⁶ As an example, consider the .doc extension: although it is commonly recognized as the extension for Microsoft Word documents, a 
file with that extension could possibly be one of nine other known file types. See 
http://www.filext.com/detaillist.php?extdetail=doc 

⁷ Commonly referred to as 'file signatures.' For a sampling of file types and their associated file signatures, see 
http://www.garykessler.net/library/file_sigs.html 
⁸ RipCoder's homepage, http://kach.nm.ru/ 
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basic, easy-to-use program, and FineCrypt,⁹ an advanced one with many user-defined 
options. These popular software programs were obtained freely and anonymously from 
the Internet. As can be seen from the illustration below, the webattack.com download 
site had FineCrypt listed as the featured download with RipCoder appearing as well.¹⁰ 



Figure 1 - Screenshot from webattack.com Download Site 

Experiments were conducted by encrypting files from a standard dataset with 
combinations of user-defined parameters that are unique to virtually every application. 
The test dataset consisted of one, two, and eight-byte text files (.txt) along with a 256- 
byte binary file with each byte representing a different ASCII character starting with the 
hexadecimal value 00, and ending with the hexadecimal value FF. As the number of 
options increases with more advanced software, so too does the number of permutations 
of settings that must be tested. (The FineCrypt analysis required the production of more 
than 640 encrypted files.) 
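
The test dataset and the growth in permutations can be sketched as follows. The file sizes follow the description above; the file names, the text-file contents, and the option values are hypothetical, since the paper does not list them.

```python
import itertools
from pathlib import Path


def build_dataset(directory: Path) -> list[Path]:
    """Write the one-, two-, and eight-byte text files and the 256-byte binary file."""
    directory.mkdir(parents=True, exist_ok=True)
    contents = {
        "one_byte.txt": b"A",               # text contents are assumed
        "two_bytes.txt": b"AB",
        "eight_bytes.txt": b"ABCDEFGH",
        "all_bytes.bin": bytes(range(256)),  # byte values 00 through FF in order
    }
    paths = []
    for name, payload in contents.items():
        path = directory / name
        path.write_bytes(payload)
        paths.append(path)
    return paths


# Hypothetical user-defined options; real applications expose their own settings.
OPTIONS = {
    "algorithm": ["alg1", "alg2"],
    "mode": ["modeA", "modeB"],
    "password": ["pw", "nopw"],
}


def setting_permutations(options: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return every combination of option values as a list of dicts."""
    keys = list(options)
    return [dict(zip(keys, combo)) for combo in itertools.product(*options.values())]


combos = setting_permutations(OPTIONS)
print(f"{len(combos)} settings x 4 source files = {len(combos) * 4} encrypted files")
```

Even three two-valued options applied to four source files already require 32 encrypted files, which shows how an application with many options can lead to hundreds of test files.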



⁹ FineCrypt's homepage, http://www.finecrypt.net/ 
¹⁰ webattack.com's homepage, http://webattack.com/ 
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Figure 2 - FineCrypt Interfaces 

The resulting encrypted files were then analyzed with a low-level disk viewer to identify 
metadata contained in the headers and footers of those files. The values in the headers 
of these files were examined as single-byte and byte-block values. The key to 
successful pattern analysis lay in the ability to identify the static header structure and 
associate the dynamic values with specific attributes of the unencrypted file and/or 
user-defined options. In addition to the test dataset, a number of files ranging from zero to 
several thousand bytes were created, encrypted, and analyzed at the experimenter's 
discretion to pursue predictable value patterns. To manage and track a dataset of that 
magnitude efficiently, a naming convention using fields based 
on user-defined options was established. The naming convention allowed for quicker 
comparisons between encrypted file characteristics and the resulting header values. 
The following illustrations are screenshots of RipCoder and FineCrypt files as seen with 
a low-level disk viewer. 






Figure 3 - RipCoder File in Low-Level Disk Viewer 


Figure 4 - FineCrypt File in Low-Level Disk Viewer 
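
The separation of static header structure from dynamic values described above can be illustrated with a short sketch. It is not the tooling used in the research (which relied on a low-level disk viewer); the function names are hypothetical, and the approach simply compares the same offset across many encrypted files and reports which offsets never change.

```python
def classify_offsets(headers):
    """Split header offsets into static and dynamic ones.

    headers: list of bytes objects, one per encrypted file.
    Returns (static, dynamic) where static maps offset -> constant value
    and dynamic lists the offsets whose value differs between files.
    Only offsets present in every header are compared.
    """
    if not headers:
        raise ValueError("at least one header is required")
    length = min(len(h) for h in headers)
    static, dynamic = {}, []
    for offset in range(length):
        values = {h[offset] for h in headers}
        if len(values) == 1:
            static[offset] = values.pop()
        else:
            dynamic.append(offset)
    return static, dynamic


def hexdump(data, width=16):
    """Render bytes in the offset / hex / text layout of a disk viewer."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{start:08X}  {hex_part.ljust(width * 3 - 1)}  {text_part}")
    return "\n".join(lines)
```

Offsets reported as static are candidates for an application signature, while each dynamic offset can then be correlated with the option that was varied between test files.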



The analysis efforts were extremely successful. Significant details and characteristics 
of the unencrypted and encrypted payloads were identified through rigorous 
examination and analysis of the encrypted files and file headers. The following 
information can be located and extracted from the metadata contained in the above 
files: 

• Application signature for positive program identification 

• Encryption algorithm used to encrypt payload 

• Encryption mode used to encrypt payload 

• Password (yes/no) and location of password byte block data 

• Key (yes/no) and location of key byte block data 

• Compression (yes/no) 

• File extension of unencrypted file 

• Number of characters in unencrypted file name and location of the 
bytes representing the name (varies with size of name) 

• Encrypted file size excluding four-byte checksum (location of 
checksum bytes was discovered) 

• Number of bytes of cipher text and exact location 

• 32-bit write-back option for DES+ algorithm (yes/no) 



As an example, consider the FineCrypt header below and note the hexadecimal value 
of the highlighted offset. 
Figure 5 - FineCrypt File in Low-Level Disk Viewer 

The hexadecimal value of 03 indicates that the algorithm used to encrypt the file was 
AES and the encryption mode employed was Cipher Feedback. The value of offset 6 
will always represent the algorithm and mode selection in FineCrypt files. The complete 
hexadecimal value matrix for offset 6 appears in the following table. 



Offset 06 

Value  Mode                    Algorithm    Value  Mode                    Algorithm
00     ?                       ?            15     Electronic Codebook     MARS
01     Electronic Codebook     AES          16     Cipher Block Chaining   MARS
02     Cipher Block Chaining   AES          17     Cipher Feedback         MARS
03     Cipher Feedback         AES          18     Output Feedback         MARS
04     Output Feedback         AES          19     Electronic Codebook     RC6
05     Electronic Codebook     Blowfish     1A     Cipher Block Chaining   RC6
06     Cipher Block Chaining   Blowfish     1B     Cipher Feedback         RC6
07     Cipher Feedback         Blowfish     1C     Output Feedback         RC6
08     Output Feedback         Blowfish     1D     Electronic Codebook     Serpent
09     Electronic Codebook     CAST-256     1E     Cipher Block Chaining   Serpent
0A     Cipher Block Chaining   CAST-256     1F     Cipher Feedback         Serpent
0B     Cipher Feedback         CAST-256     20     Output Feedback         Serpent
0C     Output Feedback         CAST-256     21     Electronic Codebook     3DES
0D     Electronic Codebook     GOST         22     Cipher Block Chaining   3DES
0E     Cipher Block Chaining   GOST         23     Cipher Feedback         3DES
0F     Cipher Feedback         GOST         24     Output Feedback         3DES
10     Output Feedback         GOST         25     Electronic Codebook     Twofish
11     Electronic Codebook     Square       26     Cipher Block Chaining   Twofish
12     Cipher Block Chaining   Square       27     Cipher Feedback         Twofish
13     Cipher Feedback         Square       28     Output Feedback         Twofish
14     Output Feedback         Square









Table 2 - Offset 6 Signature Values 
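
The mapping in Table 2 lends itself to a simple lookup. The sketch below is illustrative only: it assumes that "offset 6" is the zero-based byte index 6 of the file, and it assumes the regular ordering visible in the table (ten algorithms, four modes each, values 01 through 28 hexadecimal). The function name is hypothetical.

```python
# Algorithm and mode order as they appear when Table 2 is read by value.
ALGORITHMS = ["AES", "Blowfish", "CAST-256", "GOST", "Square",
              "MARS", "RC6", "Serpent", "3DES", "Twofish"]
MODES = ["Electronic Codebook", "Cipher Block Chaining",
         "Cipher Feedback", "Output Feedback"]

# Build value -> (algorithm, mode); value 0x00 is deliberately absent
# because the table lists it as unknown.
OFFSET6_TABLE = {
    4 * a + m + 1: (algorithm, mode)
    for a, algorithm in enumerate(ALGORITHMS)
    for m, mode in enumerate(MODES)
}


def decode_offset6(header):
    """Return (algorithm, mode) for a FineCrypt header, or None if unknown."""
    if len(header) <= 6:
        raise ValueError("header is too short to contain offset 6")
    return OFFSET6_TABLE.get(header[6])
```

For instance, a header whose byte at offset 6 is 03 decodes to AES with Cipher Feedback, matching the example discussed above.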



The file header structure and value associations remained consistent regardless of the 
unencrypted file type. Additional tests were run using Microsoft Word, PowerPoint, and 
Excel files. Image files (.jpeg, .gif, and .bmp) were also tested to ensure consistency. 
The structures and values remained consistent with very large binary 
files as well (a 600 MB random binary file). 

Additional Testing 

The deep, low-level analysis of these two file encryption applications produced a 
significant amount of data. The additional phases of testing involved monitoring file and 
registry activity during encryption, examining slack space, swap space and unallocated 
space for passwords and encrypted file content, byte boundary analysis of encryption 
algorithm and mode padding schemes, and finally, identifying and locating files and 
registry keys that remained on the test computer after uninstalling the application. A 
brief discussion of the install/uninstall monitoring results follows. 

While RipCoder is a stand-alone executable and does not require installation because it 
runs from its own program folder, FineCrypt requires its system files to be installed on 
the computer. We developed a process using installation monitoring software and a 
text comparison utility to capture and analyze all file and registry activity during 
installation and uninstallation of applications. The table below summarizes the 
installation results. 



FineCrypt Installation 

            Files    Registry Keys
Added       48       672
Modified    5        24
Deleted     8        32



Table 3 - FineCrypt Installation Data 



After the application was uninstalled, 118 registry keys and eight (8) files remained on 
the computer. After the system was rebooted, all 118 registry keys remained, but only 
one of the eight (8) files was present. Although RipCoder runs as a stand-alone 
application, two ".rip" folders were created in the registry and remained even after the 
program was deleted from the system. After uninstalling and deleting these 
applications, file and registry remnants resided on the system as conclusive evidence of 
prior existence. 
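
The install/uninstall comparison can be sketched as a difference between system snapshots. This is an illustration under stated assumptions, not the monitoring software used in the study: each snapshot is assumed to be a dictionary mapping a file path or registry key to some fingerprint of its content, and the function names are hypothetical.

```python
def diff_snapshots(before, after):
    """Compare two snapshots (dicts of item -> fingerprint)."""
    added = sorted(after.keys() - before.keys())
    deleted = sorted(before.keys() - after.keys())
    modified = sorted(
        key for key in before.keys() & after.keys()
        if before[key] != after[key]
    )
    return {"added": added, "modified": modified, "deleted": deleted}


def find_remnants(baseline, after_uninstall):
    """Items absent from the clean baseline but present after uninstalling."""
    return sorted(after_uninstall.keys() - baseline.keys())
```

Running `diff_snapshots` on snapshots taken before and after installation yields counts of the kind shown in Table 3, and `find_remnants` on the baseline and post-uninstall snapshots lists what the application left behind.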

Conclusion 

Enabling law enforcement to easily identify encrypted files on a suspect machine is only 
the beginning of what should be continuing research efforts. Although the probability of 
developing a unique process to easily crack encryption keys or passwords remains 
quite unlikely, the significant findings produced by these research efforts suggest that 
small steps can be taken to assist and support law enforcement efforts in analyzing and 
extracting critical digital evidence in the presence of an encryption application. This 
research effort produced several significant outcomes. The following are the 
accomplishments to date. 

• Encryption applications were collected and cataloged, establishing 
a large data set on which to conduct further analysis (455 
applications). 

• Using this collection, a database of hash values was created 
(10,529 files), as a tool to aid in the forensic identification of 
encryption applications. 
• Processes and procedures were developed for the identification 
and extraction of encrypted file metadata. 

• Processes and procedures were developed for all other phases of 
testing, including, but not limited to, application remnant 
identification, system monitoring during encryption, swap and slack 
space analysis, and cipher text padding analysis. 

• A geographical study was launched into the origins of current 
encryption technologies. 

• A roadmap was laid for continued research into the area. 
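
A hash-value database of the kind listed among the accomplishments above could be built along the following lines. This is a sketch only: the report does not state which hash algorithm was used, so SHA-256 is an assumption, and the function names and directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_hash_db(app_dirs):
    """app_dirs maps an application name to a directory holding its files.

    Returns a dict of hash -> (application name, file name).
    """
    db = {}
    for app_name, directory in app_dirs.items():
        for path in Path(directory).rglob("*"):
            if path.is_file():
                db[sha256_of(path)] = (app_name, path.name)
    return db


def identify(path, db):
    """Look up a file found on a suspect machine; None if it is not known."""
    return db.get(sha256_of(path))
```

Because the lookup is by content hash, a known program file is still recognized after it has been renamed or moved.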

It is imperative that research and development efforts continue to advance the 
innovative solutions available to law enforcement to combat the strength of modern and 
continuously progressive encryption applications. The findings produced by this 
research effort significantly reduce the time-consuming manual work of 
identifying encryption applications and the encryption algorithms they used. As 
research continues, the potential to overcome the impressive leads that criminal and 
terrorist entities currently maintain with the use of encryption could be significant, 
without the need to work against the law-abiding public. 

For information on obtaining a complete copy of the Encryption Report, please contact 
Christine Siedsma at the Computer Forensics Research and Development Center 
(CFRDC), csiedsma@utica.edu. 

Why Should I Create Metadata? 

This sub-page of the Metadata topic lists some reasons why 
metadata should be created. 

There are many reasons to create metadata. Metadata serves 
numerous important purposes such as data browsing, data 
transfer, and data documentation. Here are some additional 
benefits to think about: 

• Metadata helps users answer questions about the data. 

• Metadata helps publicize and support the data you or 
your organization have produced. 

• Metadata supports the creation of a data inventory. 
Documenting data and its availability provides agencies 
with the means to measure production. 

• Metadata that conform to the FGDC standard are the 
basic product of the National Geospatial Data 
Clearinghouse, a distributed online catalog of digital 
spatial data. This clearinghouse will allow people to 
understand diverse data products by describing them in 
a way that emphasizes aspects that are common among 
them. 

• Metadata may be considered insurance. Having 
metadata available ensures that potential data users can 
make an informed decision about the appropriate use of 
a data set. 

• Metadata is a key component of data lineage. It 
provides basic information about the source and 
derivation of a data set. 

The state of Idaho has recognized the importance of geospatial 
metadata by creating the following standards and 
guidelines: 

Category: S4220 - Geospatial Metadata 
Category: G320 - Geographic Metadata Guideline 






http://inside.uidaho.edu/tutorial/metadata/WhyCreateMetadata.htm 
How to Preserve, Collect, Recover and Filter Electronic Evidence 

What are File Headers? (Signatures) 

Many file types can be identified by using what's known as a file header. A file header is a 'signature' 
placed at the beginning of a file, so the operating system and other software know what to do with the 
following contents. 

Many electronic discovery applications use the file header as a means to verify file types. The 
common fear is that if a custodian changes a file's extension, or the file wasn't named using an application's 
default naming convention, that file will be missed during electronic discovery processing. For example, 
if I create a Microsoft Word document and name it 'myfile.001' instead of 'myfile.doc', and then 
attempt to locate all Microsoft Word files at a later date, I would miss the file if I were looking for all 
files ending in '.doc'. There are specific file extensions associated with the native application. 

During a computer forensic investigation, file headers are extremely valuable because they allow us to 
locate the contents of deleted files, user activity logs, registry entries, and other relevant artifacts. For 
example, if I'm investigating a custodian's hard drive for evidence that they were working for a 
competing company, I would want to recover their file activity records. A large number of custodian 
activity records are often already purged or deleted. By scanning a computer's hard drive for the 
signature related to user activity records, we often recover relevant artifacts (file access records) up to 
several years after they were deleted. 
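
To make the idea concrete, here is a minimal sketch of header-based identification that ignores the file extension entirely. The byte sequences are widely published signatures for these formats; the function name, the label strings, and the short list itself are illustrative assumptions, and real discovery tools use far larger signature tables.

```python
# (signature bytes at the start of the file, human-readable label)
SIGNATURES = [
    (b"%PDF", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"II*\x00", "TIFF image (little-endian)"),
    (b"MM\x00*", "TIFF image (big-endian)"),
    (b"PK\x03\x04", "ZIP container (also used by .docx and .xlsx)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",
     "OLE2 compound file (legacy .doc, .xls, .ppt)"),
]


def identify_by_header(data):
    """Return a label for the first matching signature, or None."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None


def identify_file(path, probe_size=16):
    """Read only the first few bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify_by_header(handle.read(probe_size))
```

With this approach, the 'myfile.001' example above would still be reported as an OLE2 compound file if it was saved in the legacy Word format, whatever its name.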
WHY GROUP 4 TIFF/CALS 
IS THE 
FORMAT OF CHOICE 
FOR ELECTRONICALLY DISTRIBUTING 
AND ARCHIVING DRAWINGS 
USED IN 
THE CONSTRUCTION INDUSTRY 



This paper examines two formats that are often put forward as alternatives for electronically 
distributing construction drawing documents in applications such as permit approval, bid 
solicitation and facilities management. The paper discusses Group 4 TIFF/CALS and PDF: 
their origins and structural characteristics; and presents the file storage requirements of a 
typical original CAD document converted to each format to provide a frame of reference for 
the discussion of the relative benefits of each format type. 



Introduction 



This White Paper discusses the Group 4 TIFF (Tagged Image File Format) or CALS (Continuous 
Acquisition and Life-cycle Support), and PDF (Portable Document Format) format types for use in 
distributing drawings in the construction industry for Electronic Bid Solicitations. 

The main reason such a choice is necessary is that it is generally accepted that distributing original CAD 
design documents is impractical due to their proprietary nature, their lack of inherent 'original work' 
protection for the originator and their typically large file size. 

Industry has dealt with this distribution issue in the past mainly by producing paper copies of the 
documents. However, as the inefficiencies of this method have become increasingly obvious, alternative 
formats have been put forward as the means of performing such distribution electronically by either 
converting the original CAD format or by scanning the paper copies and outputting to a particular format. 

In examining this subject, it is easy for confusion to arise or misinformation to occur through 
oversimplification of what is a relatively complex issue. For example, Group 4 TIFF is often confused with other 
variants of the TIFF format, which produce greater file sizes than Group 4. This paper attempts to clarify 
key issues sufficiently without delving into an overly technical discussion. 

Origins and Structures 

Group 4 TIFF is an 'open' industry-standard raster* format that was designed by CCITT as a general 
monochromatic format for use in the copying and facsimile industry where compactness and image 
versatility are primary considerations. Group 4 TIFF is part of the TIFF format, which has several different 
variants using different compression techniques. Group 4 is widely used in the reprographic industry and 
employs a very efficient compression capability. 

The CALS file format is a US government-accepted variant of the Group 4 TIFF specification specifically 
developed and supported as an archival standard within the government. CALS is almost identical to 
Group 4 TIFF except for header information. For purposes of this document these formats are therefore 
regarded as being virtually the same. 

PDF is a proprietary file format controlled by Adobe Systems and in certain situations has license cost 
implications. It is commonly used to convert disparate word processing and graphic application files into a 
document format that can only be viewed with the PDF viewer called Acrobat Reader. A PDF document 
contains one or more pages consisting of text, graphics and images produced directly from applications or 
from files containing PostScript page descriptions. A PDF document may also contain information in 
electronic representation only, such as hypertext links. 

Unlike Group 4 TIFF, which stores information and performs compression in a uniform way, PDF can be a 
composite structure containing multiple format types and compression techniques. To reduce file size, PDF 
supports multiple compression filters including: JPEG compression of color and grayscale images; CCITT 
Group 3, CCITT Group 4, LZW and Run Length compression of monochrome; and LZW and Flate 
compression of text, graphics and indexed image data. 

In "automatic" compression mode the PDF creation tool analyses the document structure and then 
chooses what it deems appropriate. 

File Sizes 

The first thing to note is that, because of the way each of these file formats works, it is not possible to 
develop an exact formula or relationship between format types for the specific amount of storage space 
that will be used for a particular drawing. How much space each format type will take to represent a 
particular drawing depends on the attributes and complexity of the drawing, the size of the drawing, 
and the resolution level at which you wish to store the document. The ratios applicable to each format type 
are therefore not consistent from document to document. 
To provide a frame of reference, however, the following is an actual example of a typical drawing 
document: 

Original CAD format (DWG)                           1.62 megabytes (MB) 
PDF (using automatic default compression)           24.5 KB 
Full Resolution (200 dpi) TIFF Group 4 (or CALS)    13.3 KB 
Preview Resolution TIFF Group 4 (or CALS)           3.3 KB 



Compression Accuracy 

Unlike PDF, which uses a variety of compression techniques (as described above) including lossy (which 
'throws away' selected data), the TIFF format uses strictly non-lossy compression algorithms. While PDF's 
complex compression capabilities work well for complex documents, they add overhead to relatively 
uniform document structures such as construction drawings. In addition, lossy compression can result in 
inaccurate reconstruction/scaling of the images which can render them unreliable for scaled printing and 
for calibration to perform electronic on-screen measurements: an important function in the take-off process 
needed with bid documents. 

Internet Distribution 

The ease of access to documents and the options available for their download and use off-line are major 
areas of consideration when they are being distributed over the Internet. 

TIFF/CALS - A technique used to further improve the access speed of raster drawings over the Internet is 
to provide a lower resolution preview image in addition to the full resolution image for a particular 
document. Users can browse in preview mode, select the documents they wish to download and then 
download the full resolution documents in an unattended batch mode at a later time. The ability to 
selectively view and then download individual TIFF/CALS images from within a structured document is 
extremely important to users. 

PDF - With PDF, multiple pages are stored as one document and cannot be retrieved online individually as 
with the TIFF example. The extended time needed to download the complete PDF file in order to work with 
any one part of the document off-line adds substantial overhead and makes the process extremely 
cumbersome especially with large projects. 



Design professionals have long been comfortable with paper distribution for which TIFF or CALS is the 
electronic equivalent and provides the greatest amount of 'original work' protection. TIFF/CALS cannot be 
easily imported into a vector format and therefore is well suited for public distribution. 

Document Organization & Tools 

The organization and presentation of TIFF drawings is performed by an application such as MaxView that 
was designed to handle TIFF/CALS documents. MaxView is specifically oriented towards construction 
document handling and its functionality, for both the author and the viewer of these types of documents, is 
far superior to that of the PDF equivalent (Acrobat). 

Acrobat only allows the user to import drawings in a variety of sizes and formats up to and including E-size; 
and Acrobat makes no provision to support other format types within its organization structure. PDF 
documents have basic navigation capabilities but have no count and measurement capabilities for takeoff 
functions. 

MaxView organizes all the documents of a project into an intuitive tree-structure with folders and files that 
maintain their original document format. Typically, drawings are stored in TIFF/CALS while specifications 
are stored in TIFF, Word, Excel or PDF format. MaxView provides the user with a way to calibrate the TIFF 
drawing to a specific scale and then to complete takeoffs using the count, distance and area measuring 
tools intrinsic to the MaxReader application. This functionality, coupled with the ability to accurately print 
any drawing size to scale, is a major advantage for the use of MaxView in plan review and electronic bid 
solicitation applications. 
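
The arithmetic behind calibrating a raster drawing to scale can be sketched briefly. This is not MaxView's implementation, which this paper does not describe; it is a generic illustration in which the function names, the scan resolution, and the drawing scale are all assumed values.

```python
import math


def real_length(p1, p2, dpi, scale_ratio):
    """Convert a pixel distance to a real-world length.

    p1, p2: (x, y) pixel coordinates on the raster image.
    dpi: scan resolution in pixels per inch.
    scale_ratio: real-world units per paper unit, e.g. 48 for a
    drawing at 1/4 inch = 1 foot (48 real inches per paper inch).
    Returns the length in real-world inches.
    """
    paper_inches = math.dist(p1, p2) / dpi
    return paper_inches * scale_ratio


def calibrate(p1, p2, known_length):
    """Derive real-world units per pixel from a dimension of known length."""
    pixels = math.dist(p1, p2)
    if pixels == 0:
        raise ValueError("calibration points must be distinct")
    return known_length / pixels
```

On a 200 dpi image at an assumed scale ratio of 48, a line 400 pixels long corresponds to 2 inches on paper and 96 inches (8 feet) in the building. The calculation depends on knowing the resolution and scale, which is why accurate, non-lossy scaling of the image matters for takeoff work.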

Other Considerations 

There are a number of other factors that impact the format type to be selected: 

- In many situations, it is necessary to convert legacy documents from a paper source to an 
electronic format. TIFF images are the standard for scanned drawings. 

- Since TIFF is so widely accepted and used in the reprographics industry, it is very easy for end 
users to obtain printed paper copies if they desire. 

- Major US government organizations such as USACE and the Air Force have standardized on 
the CALS format, giving it assurance as a viable long-term archival vehicle. 



To summarize, Group 4 TIFF/CALS has a size advantage over PDF, and with the use of previews this 
advantage is extended. Combining this with better 'original work' protection, its accurate scalability, its 
ability to work with all forms of document inputs including paper, its 'open' non-proprietary nature and 
general community acceptance, along with a better selection of available tools, gives TIFF/CALS a decisive 
edge as the format of choice for distributing electronic plan pages used in the construction industry. 



About the Author: Key Churchill is a document imaging expert who, as a principal of Integrated Imaging 
Inc, has been involved with many imaging application developments including the design and implementation 
of the Valley Construction News website in Roanoke, Virginia, which has performed electronic plan 
distribution since 1997, and more recently MaxView's MaxPlans website. 
Electronic Records: Preservation and Access 

October 6, 2005 
Dr. Charles Dollar 

Seventh in the Missouri Electronic Records Education and Training Initiative (MERETI) workshop series 


Note: Click on the ^ to watch the instructor discuss key points. The number refers to the corresponding slide in 
the accompanying PowerPoint presentation and handout. 

In this advanced workshop, Dr. Dollar explores the many elements involved in the long term preservation and use 
of electronic records. He explains what digital archiving is, and why it is so difficult. He discusses standards that 
are available for guidance, and talks about the significant problems posed by technology obsolescence. Dr. 
Dollar discusses in depth the important considerations of storage media, file formats, and metadata for long term 
preservation and access. He cites case studies of currently operating digital archiving programs, and talks about 
new initiatives to watch. Finally, he talks about analog alternatives or backups as part of a long term digital 
archiving strategy, and wraps up the workshop with a summary of the key points. 



Dr. Dollar based this workshop primarily on material contained in 
his book Authentic Electronic Records: Strategies for Long-Term 
Access, Cohasset Associates, 1999. He is in the process of 
updating this book, and expects a new edition to be published in 
May, 2006. Material is also drawn from the ISO Technical 
Report 18492:2005, Long-term preservation of electronic 
document-based information, a discussion draft copy of which 
was provided to class attendees. 

Digital objects contain three attributes: the physical, the logical, 
and the conceptual. Physically, digital objects are made up of a 
string of binary signals recorded on a storage medium; they 
have no meaning by themselves. Operating software must be 
employed to provide a logical organization to the binary data and 
recognize it as a logical object based on data type. Finally, 
application software provides a conceptual meaning to the data 
objects, rendering data in human understandable form, and 
giving it content, context, and structure. All of these attributes 
must be considered when planning for digital archiving and 
access. 

Digital archiving incorporates many activities and 
considerations. Electronic records must be protected from loss, 
alteration, and corruption. Their accessibility must be assured 
across organizational boundaries and across multiple 
technology changes and environments. Future users must be 
able to use the records in multiple ways and for many purposes, while retaining the record's meaning and 
authenticity. These goals must be accomplished despite ongoing changes over time in recording media, 
operating systems, file types and specifications, data coding systems, and metadata. ^17 

The process of digital archiving is made difficult because of the digital nature of the records versus traditional 
physical records. Physical records are easy to see, touch, understand, and manage, compared to digital records 
which require hardware and software to give them their logical meaning and interpretation, and provide them 
storage and retrievability. Electronic records require a software "interpreter" to make them understandable to 
humans. They are dependent on both the system operating software that makes the computer function and the 
application software, and they require the user to have a computer to use them. ^ 24 Digital records can easily 
be rendered unusable by technological obsolescence, which is inevitable and irreversible. 
The International Organization for Standardization (ISO) has published several standards related to long-term 
preservation of electronic records and data. In particular, ISO 15489, Parts 1 and 2, Records Management, 
provides the framework for an effective records management program. ISO 14721:2003, Open archival 
information system - Reference model, describes a high-level model for any electronic records repository. It sets 
standards for processes of data ingest, archival storage, data management, preservation planning, and access. 
^ 32 The OAIS model establishes a shared view of requirements that can lead to an interoperable network of 
digital archives, a key component in grid computing. ^ 36 43 ISO 18492:2005, Document management 
applications - Long-term preservation of electronic document-based information, provides methodological 
guidance for the long-term preservation and retrieval of authentic electronic document-based information, when 
the retention period exceeds the expected life of the technology used to create and maintain the information. It 
sets long-term preservation goals to ensure information is readable, intelligible, identifiable, retrievable, 
understandable, and authentic. ^ 46 49-50 

Digital preservation requires that we deal with problems caused by technology obsolescence. For currently active 
electronic records, this will involve media renewal (or "refreshing") and conversion. Media renewal is the process 
of reformatting or copying data to new storage media to ensure its continued readability. Conversion involves the 
shift from one technology environment to another, such as from one version of software to a newer version, or 
from one software (e.g. Word Perfect) to another (Word), while maintaining the essential qualities of the electronic 
record. During conversion, we must maintain the processibility of the active records. ^65 ^66 
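
Media renewal is only useful if the copy is bit-for-bit identical to the original. One common way to check this is to compare checksums before and after copying. The sketch below illustrates that idea; it is not drawn from the workshop, and the choice of SHA-256 and the function names are assumptions.

```python
import hashlib
import shutil


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def renew_media_copy(source, destination):
    """Copy a record to new media and verify the bit stream is unchanged.

    Returns the verified digest, which can be stored as preservation
    metadata and rechecked at the next renewal.
    """
    before = file_digest(source)
    shutil.copyfile(source, destination)
    after = file_digest(destination)
    if before != after:
        raise IOError(f"copy of {source} does not match the original")
    return after
```

Recording the returned digest alongside the record gives later custodians a way to confirm that the bit stream has survived each subsequent copy.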

For sets of legacy records which we wish to preserve, technology obsolescence will require us to perform not only 
media renewal to preserve the data, but other long-term strategies as well. The Data Archaeology strategy 
represents the minimalist approach, in which we would keep the original data bit stream viable, and use reverse 
engineering in the future to devise a method to access and use that data using then-current technologies. Similar 
to that is the Museum Perspective, in which original hardware equipment and software versions are saved in 
operational condition, to be able to utilize legacy data. The Jet Propulsion Laboratory and the Washington State 
Digital Archives have taken this approach. For some types of evidential/informational records, Viewer Technology 
may provide access to images of records, without providing full functionality. 

Emulation is the process of using today's computers and software to create a replica of another computer with 
such fidelity that it can operate in place of the other computer. Dr. Dollar discussed a number of projects 
designed to demonstrate the feasibility of emulation to provide access to legacy records. 

Migration is an essential component of a digital preservation program. Its purpose is to ensure usable and 
trustworthy electronic records for as long as necessary without regard for the computer technology platform. It 
presumes that the bit stream remains readable through media refreshment ^ 92 and, whenever possible, 
involves converting electronic records to technology neutral file formats. It should provide backward compatibility 
and should preserve the processibility of records. Risks associated with migration include possible alteration of 
the "look and feel" of records, possible loss of some data values, potential to introduce errors without good quality 
control, difficulty and cost of migrating complex interactive digital records, and the likelihood that the process will 
be never ending. Past migration efforts have shown that projects usually take longer and cost more than 
planned. ^ 97 

When determining the appropriate storage media for large quantities of electronic records, one must consider the 
speed (data transfer rate) of the selected medium, as well as its cost, capacity, and durability. ^ 104 ^ 112 
^ 120 After discussing each storage medium in depth, Dr. Dollar concludes that magnetic media is more robust 
than optical, that magnetic tape holds advantages over "spinning disk" storage, and that a high data transfer rate 
is a vital consideration for storage and migration of huge quantities of electronic data. 
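
The importance of data transfer rate is easy to see with a rough calculation. The figures used here are hypothetical and chosen only to show the arithmetic; they are not taken from the workshop.

```python
def transfer_hours(total_bytes, bytes_per_second):
    """Estimate the hours needed to move a quantity of data at a given rate."""
    if bytes_per_second <= 0:
        raise ValueError("transfer rate must be positive")
    return total_bytes / bytes_per_second / 3600


# Hypothetical example: 10 TB of records at a sustained 100 MB per second.
hours = transfer_hours(10 * 10**12, 100 * 10**6)
print(f"{hours:.1f} hours")
```

Halving the sustained rate doubles the migration time, which becomes significant when an archive must be refreshed repeatedly over its lifetime.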

File formats tell the operating system how to interpret the 0s and 1s that comprise the electronic file. They 
specify the internal logical arrangement of data within digital objects, and provide special instructions such as 
compression algorithms. Formats also provide information understood by specific application software. 

Two considerations when determining the file format for preserving electronic records include which format to use 
for specific information content, and whether to choose proprietary or non-proprietary formats. There are several 
types of electronic files, each of which have multiple formats from which to choose. Types of files include text, 
vector graphics, graphic images, compressed graphic images, databases, video, and audio, among others. The 
concern regarding proprietary formats is that the owner of the format 
may restrict access to the format, or possibly go out of business and not 
be able to support the format in the future. Non-proprietary, 
open-source, widely used formats are more likely to remain usable 
in the future. 

An ideal file format, from a preservation point of view, would have these 
properties: 

• Device independence, without regard to the hardware/software 
platform 

• Self-contained, containing all the resources necessary for 
rendering 

• Self-documenting, containing its own description 

• Transparency, capable of direct analysis with basic tools 

• Absence of technical protection mechanisms, such as 
encryption, passwords, etc. 

• Disclosure, with an authoritative specification publicly available 

• Adoption, with widespread use being the best deterrent to 
obsolescence. 

The recently approved PDF/A file format standard, ISO 19005, specifies 
how to use the Portable Document Format (PDF) 1.4 for long-term 
preservation of documents (/Archives). It addresses three primary 
issues: defining a file format that preserves the static visual appearance 
of electronic documents over time, providing a framework for recording 
metadata about electronic documents, and providing a framework for defining the logical structure and semantic 
properties of electronic documents. ^ 143 

In summarizing the file format discussion, Dr. Dollar recommends: ^ 161 

• Choose file formats based on recordkeeping requirements, such as integrity and processibility 

• Avoid proprietary single vendor products 

• Use mainstream technology products 

• Require transferability functionality to facilitate migration 

• Consider XML, PDF, and PDF/A as good choices. 

Metadata must be captured for electronic records to provide technical, business, and contextual information 
about them. Technical information includes data about the creation and use of the record, the software 
application, and the file formats. Business information includes applicable business rules, integrity rules, and 
access/authorization rights. Contextual information describes "who, what, when, why", the linkage between and 
among records, preservation information, and offers an audit trail. Metadata is best captured at the creation or 
receipt of the record. While the system can provide much metadata, often users are required to key in small to 
large amounts of metadata. 
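
The three metadata groups described above can be sketched as a simple record structure. This is only an illustration under assumed names: the class `RecordMetadata`, its field names, and the `missing_groups` helper are hypothetical and do not come from any metadata standard or from the presentation.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class RecordMetadata:
    """Hypothetical container for the three metadata groups named in the text."""

    # Creation and use of the record, software application, file formats.
    technical: dict = field(default_factory=dict)
    # Business rules, integrity rules, access/authorization rights.
    business: dict = field(default_factory=dict)
    # Who/what/when/why, links among records, preservation info, audit trail.
    contextual: dict = field(default_factory=dict)

    def missing_groups(self) -> list:
        """List the groups that are still empty and may need user entry."""
        return [name for name, values in asdict(self).items() if not values]
```

A capture system could fill `technical` automatically at creation or receipt and use `missing_groups()` to prompt users for whatever still has to be keyed in.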

There are presently only three operational digital archives. ^ 170 The OCLC Digital Archives is a fee-based 
repository service for libraries and other institutions. Institutions can transfer electronic items to OCLC, which will 
preserve them and provide on-line reference services. DSpace is a digital repository system that captures, 
stores, indexes, preserves, and redistributes digital research materials. It is designed for academic library 
repositories, and requires customization to accommodate archives. 

The Washington State Digital Archives is the only operating state digital archives. Planning began in 1999 and 
the facility opened in 2004, at an initial cost of $14.8 million. The concept is based on a well-developed feasibility 
study, and identifies state agency partners in terms of their level of technological sophistication and ability to 
transfer archival records in appropriate original formats. The project benefits from funding from a $1.00 recording 
fee on all filing transactions and additional support from Microsoft, a Washington corporation, and may not be 
easily duplicated in other states. 

Besides the Washington Digital Archives, other projects under development include a demonstration project being 
undertaken by the Georgia State Archives with NHPRC funding, a collaboration project between the Smithsonian 
Institution and Rockefeller Archives Center, and the National Archives and Records Administration's large-scale 
Electronic Records Archives project. 

For records that must be kept long-term, consideration should be given to capturing them in microfilm or other 
analog format (paper) as well as digital form. The Digital Archive Writer from Kodak produces black and white 
microfilm from document images, and other equipment can produce color and larger-sized microfilm. A new 
technology, Datasurance, captures digital images in a non-proprietary 2-D barcode format, encloses 
human-readable information on how to decode the barcodes, and incorporates it all on microfilm. When decoded, the 
barcodes recreate the original digital image. Microfilm capture should be considered for records where 
reproducibility, rather than processibility, will satisfy your regulatory compliance, business needs, and historical 
accountability. 

Any organization preserving digital records must prepare a mission statement to define its purpose. It needs to 
define its preservation policy, describing how the mission will be carried out and specifying what activities will be 
done (and not done) in various circumstances. In determining the strategies to adopt, the organization looks at 
the convergence of available technology with its policies, as well as applicable published standards to guide 
it. ^215 It must then identify existing best practices which it can adopt. 

The threshold issues in digital preservation are to keep digital records readable, and ensure their integrity and 
trustworthiness over time. We cannot try to preserve everything; we must not substitute quick fixes for 
long-term solutions; and we should not implement technologies that are on the fringe of the marketplace. 
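
One common way to check integrity over time, offered here only as a hedged illustration since the presentation does not name a specific technique, is to record a cryptographic digest when a record is ingested and recompute it later. The function names `fixity_digest` and `is_unchanged` are assumptions made for this sketch; `hashlib.sha256` is part of the Python standard library.

```python
import hashlib


def fixity_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a record's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at ingest."""
    return fixity_digest(data) == recorded_digest
```

A matching digest shows only that the bits are unchanged; it does not show that the record can still be rendered, which is why readability and format choice remain separate concerns.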